<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Kafka on CrippledMind's InfoSec Journal</title><link>https://crippledmind-infosec-journal.netlify.app/tags/kafka/</link><description>Recent content in Kafka on CrippledMind's InfoSec Journal</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Sun, 16 Jun 2024 14:55:54 +0530</lastBuildDate><atom:link href="https://crippledmind-infosec-journal.netlify.app/tags/kafka/index.xml" rel="self" type="application/rss+xml"/><item><title>SecureNet</title><link>https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/</link><pubDate>Sun, 16 Jun 2024 14:55:54 +0530</pubDate><guid>https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/</guid><description>&lt;h1 id="securenet-project">SecureNet Project
&lt;/h1>&lt;h2 id="overview">Overview
&lt;/h2>&lt;p>&lt;a class="link" href="https://github.com/VikasShavi/SecureNet" target="_blank" rel="noopener"
>SecureNet&lt;/a> is a network security project built around a Network Intrusion Detection System (NIDS). It combines data preprocessing, feature selection, machine learning-based log classification, and a Streamlit dashboard that visualizes key metrics.&lt;/p>
&lt;h2 id="workflow">Workflow
&lt;/h2>&lt;h3 id="1-data-collection-and-preprocessing">1. Data Collection and Preprocessing
&lt;/h3>&lt;p>The project begins with the collection of network logs, which are sent to a Kafka topic named &amp;ldquo;logs&amp;rdquo; for initial preprocessing. The first Python file handles this task, preparing the data for feature selection.&lt;/p>
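&lt;p>A minimal sketch of how log rows might be batched and published to the logs topic. The helper names (iter_chunks, publish_csv, kafka_sender) are hypothetical and not taken from the repository; the chunk size of 10 mirrors the simulation described later in this post, and the kafka-python client is an assumption about which Kafka library is used.&lt;/p>

```python
import csv
import json
from itertools import islice


def iter_chunks(rows, chunk_size=10):
    """Yield successive lists of at most chunk_size rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def publish_csv(csv_lines, send, topic="logs", chunk_size=10):
    """Parse CSV lines and hand each chunk to send(topic, payload_bytes).

    Returns the number of chunks published.
    """
    reader = csv.DictReader(csv_lines)
    published = 0
    for chunk in iter_chunks(reader, chunk_size):
        send(topic, json.dumps(chunk).encode("utf-8"))
        published += 1
    return published


def kafka_sender(bootstrap_servers="localhost:9092"):
    """Build a send(topic, payload) callable backed by kafka-python (assumed)."""
    from kafka import KafkaProducer  # imported lazily; requires kafka-python

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)

    def send(topic, payload):
        producer.send(topic, value=payload)
        producer.flush()

    return send
```

&lt;p>With a broker running, something like publish_csv(open("raw_logs.csv"), kafka_sender()) would push the file in chunks; raw_logs.csv is a placeholder file name. Passing the sender in as a callable keeps the chunking logic easy to try without Kafka.&lt;/p>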
&lt;h3 id="2-feature-selection-and-further-preprocessing">2. Feature Selection and Further Preprocessing
&lt;/h3>&lt;p>A second Python file retrieves the preprocessed data from the &amp;ldquo;logs&amp;rdquo; Kafka topic, performs additional preprocessing, and sends the refined data to another Kafka topic named &amp;ldquo;logsprocessed.&amp;rdquo;&lt;/p>
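&lt;p>The consume, transform, and forward pattern of this stage could be sketched as follows. This is an illustration only: select_features and run_stage are hypothetical helpers, and the field names used when calling them would depend on the actual log schema, which is not shown here.&lt;/p>

```python
import json


def select_features(record, keep):
    """Return a copy of record restricted to the fields listed in keep."""
    return {name: record[name] for name in keep if name in record}


def run_stage(messages, transform, publish, out_topic="logsprocessed"):
    """Decode each JSON message, transform it, and publish the result.

    messages: iterable of JSON-encoded bytes or strings
    transform: function mapping one decoded record to a new record
    publish: callable taking (topic, payload_bytes)
    Returns the number of records forwarded.
    """
    forwarded = 0
    for raw in messages:
        record = json.loads(raw)
        payload = json.dumps(transform(record)).encode("utf-8")
        publish(out_topic, payload)
        forwarded += 1
    return forwarded
```

&lt;p>Assuming the kafka-python client, the messages argument could be fed from a KafkaConsumer subscribed to the logs topic by iterating over it and passing each message's value.&lt;/p>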
&lt;h3 id="3-machine-learning-based-log-classification">3. Machine Learning-Based Log Classification
&lt;/h3>&lt;p>The third Python file retrieves data from the &amp;ldquo;logsprocessed&amp;rdquo; Kafka topic. It passes the logs through a trained machine learning model to classify them into categories: Background, Normal, or Botnet. The results are then sent to the &amp;ldquo;logslabelled&amp;rdquo; Kafka topic.&lt;/p>
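&lt;p>As a rough sketch of the labelling step: the function below assumes a model object with a scikit-learn style predict method that returns integer class indices, and it assumes the index order Background, Normal, Botnet. Both are assumptions for illustration; the trained model in the repository may encode its classes differently.&lt;/p>

```python
# Assumed mapping from class index to label; the real encoding may differ.
LABELS = ("Background", "Normal", "Botnet")


def classify_records(model, records, feature_names):
    """Attach a human-readable label to each record using model.predict.

    records: list of dicts holding at least the fields in feature_names
    Returns new dicts; the input records are left unchanged.
    """
    if not records:
        return []
    # Build one feature row per record, in a fixed column order.
    matrix = [[record[name] for name in feature_names] for record in records]
    predictions = model.predict(matrix)
    return [
        {**record, "label": LABELS[int(index)]}
        for record, index in zip(records, predictions)
    ]
```

&lt;p>The labelled dictionaries can then be serialized and sent to the logslabelled topic in the same way as in the earlier stages.&lt;/p>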
&lt;h3 id="4-data-storage-with-apache-pinot">4. Data Storage with Apache Pinot
&lt;/h3>&lt;p>Apache Pinot acts as a consumer, ingesting data from the &amp;ldquo;logslabelled&amp;rdquo; Kafka topic and storing it in a database. This ensures efficient storage and retrieval of labeled log data.&lt;/p>
&lt;h3 id="5-streamlit-dashboard">5. Streamlit Dashboard
&lt;/h3>&lt;p>The final component is a Streamlit dashboard that fetches data from Apache Pinot. The dashboard displays key metrics and insights derived from the labeled log data. This visualization aids in better defending against network attacks by providing a real-time overview of network security.&lt;/p>
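&lt;p>One metric such a dashboard might show is the share of traffic labelled Botnet. The sketch below is hypothetical: label_summary is not from the repository, and the label field name is an assumption about how rows come back from Pinot.&lt;/p>

```python
from collections import Counter


def label_summary(rows, label_field="label"):
    """Count rows per label and compute the fraction labelled Botnet."""
    counts = Counter(row[label_field] for row in rows)
    total = sum(counts.values())
    botnet_share = counts.get("Botnet", 0) / total if total else 0.0
    return {
        "counts": dict(counts),
        "total": total,
        "botnet_share": botnet_share,
    }
```

&lt;p>In a Streamlit app, the resulting value could be displayed with st.metric, for example as a formatted percentage.&lt;/p>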
&lt;h2 id="getting-started">Getting Started
&lt;/h2>&lt;p>To set up and run the SecureNet project, follow these steps:&lt;/p>
&lt;ul>
&lt;li>Clone the repository:&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-lua" data-lang="lua">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">git&lt;/span> &lt;span class="n">clone&lt;/span> &lt;span class="n">https&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="o">//&lt;/span>&lt;span class="n">github.com&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">yourusername&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">SecureNet.git&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">cd&lt;/span> &lt;span class="n">SecureNet&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>Then download the required files from &lt;a class="link" href="https://iitgoffice-my.sharepoint.com/:f:/g/personal/v_shavi_iitg_ac_in/EipXYnDl2_VOm3tJh03EbswBaUhZxTKsDsLQEB9q40NKrg?e=wKbOVs" target="_blank" rel="noopener"
>this link&lt;/a> and move them into the SecureNet folder.&lt;/li>
&lt;/ul>
&lt;h2 id="model-training">Model Training
&lt;/h2>&lt;p>The preprocessing.py file cleans the raw log data and prepares it for training. Its output, once renamed to prepro.csv, is used by the MLmodeltraining.py file to train the model.&lt;/p>
&lt;ul>
&lt;li>First, run the preprocessing.py file.&lt;/li>
&lt;li>It generates a CSV file in a folder named outprepro.&lt;/li>
&lt;li>Rename the CSV file to prepro.csv.&lt;/li>
&lt;li>Now run the MLmodeltraining.py file. This saves the trained model in the Model folder, ready to be used.&lt;/li>
&lt;/ul>
&lt;h2 id="network-intrusion-detection">Network Intrusion Detection
&lt;/h2>&lt;p>Here we simulate log data arriving in real time: a CSV file of raw log data is read and sent to Kafka in chunks of 10 rows.
The flow of the log data is shown below:
&lt;img src="https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/path.jpeg"
width="1280"
height="277"
srcset="https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/path_hubcca10d22de4111a6ec221ed211b1a02_51101_480x0_resize_q75_box.jpeg 480w, https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/path_hubcca10d22de4111a6ec221ed211b1a02_51101_1024x0_resize_q75_box.jpeg 1024w"
loading="lazy"
alt="Flow of the log data through the pipeline"
class="gallery-image"
data-flex-grow="462"
data-flex-basis="1109px"
>
To run the project, follow the steps below.
NOTE: Run each command in its own terminal.&lt;/p>
&lt;ol>
&lt;li>Start Apache ZooKeeper and Kafka in separate terminals, one after the other, with the following commands:&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-lua" data-lang="lua">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">zookeeper&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">server&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">start&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">opt&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">homebrew&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">etc&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">zookeeper&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">zoo.cfg&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">kafka&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">server&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">start&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">opt&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">homebrew&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">etc&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">kafka&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">server.properties&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="2">
&lt;li>Create the Kafka topics &amp;ldquo;logs&amp;rdquo;, &amp;ldquo;logsprocessed&amp;rdquo; and &amp;ldquo;logslabelled&amp;rdquo;:&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">kafka&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">topics&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">create&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">topic&lt;/span> &lt;span class="n">logs&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">bootstrap&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">server&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">9092&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">kafka&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">topics&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">create&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">topic&lt;/span> &lt;span class="n">logsprocessed&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">bootstrap&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">server&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">9092&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">kafka&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">topics&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">create&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">topic&lt;/span> &lt;span class="n">logslabelled&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">bootstrap&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">server&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">9092&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="3">
&lt;li>Start Apache Pinot Controller, Broker and Server&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-lua" data-lang="lua">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">pinot&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">admin&lt;/span> &lt;span class="n">StartController&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">zkAddress&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">2181&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">clusterName&lt;/span> &lt;span class="n">PinotCluster&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">controllerPort&lt;/span> &lt;span class="mi">9001&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">pinot&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">admin&lt;/span> &lt;span class="n">StartBroker&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">zkAddress&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">2181&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">clusterName&lt;/span> &lt;span class="n">PinotCluster&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">brokerPort&lt;/span> &lt;span class="mi">7001&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">pinot&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">admin&lt;/span> &lt;span class="n">StartServer&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">zkAddress&lt;/span> &lt;span class="n">localhost&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">2181&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">clusterName&lt;/span> &lt;span class="n">PinotCluster&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">serverPort&lt;/span> &lt;span class="mi">8001&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">serverAdminPort&lt;/span> &lt;span class="mi">8011&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="4">
&lt;li>Send the table schema and table config to Apache Pinot.&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-lua" data-lang="lua">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">pinot&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">admin&lt;/span> &lt;span class="n">AddTable&lt;/span> &lt;span class="err">\&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">-&lt;/span>&lt;span class="n">schemaFile&lt;/span> &lt;span class="n">files_config&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">transcript_schema.json&lt;/span> &lt;span class="err">\&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">-&lt;/span>&lt;span class="n">tableConfigFile&lt;/span> &lt;span class="n">files_config&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">transcript_table_realtime.json&lt;/span> &lt;span class="err">\&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="o">-&lt;/span>&lt;span class="n">controllerPort&lt;/span> &lt;span class="mi">9001&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">exec&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="6">
&lt;li>
&lt;p>Start 0.py, 1.py and 2.py in three separate terminals, one after the other.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Open the Apache Pinot dashboard to watch the data being ingested: &lt;a class="link" href="http://localhost:9001" target="_blank" rel="noopener"
>Link&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Run the Streamlit app to see the dashboard:&lt;/p>
&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-lua" data-lang="lua">&lt;span class="line">&lt;span class="cl">&lt;span class="err">❯&lt;/span> &lt;span class="n">streamlit&lt;/span> &lt;span class="n">run&lt;/span> &lt;span class="n">app.py&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="screenshot">Screenshot
&lt;/h2>&lt;p>This is what your dashboard will look like&amp;hellip;😁
&lt;img src="https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/design.png"
width="3360"
height="7296"
srcset="https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/design_hud254addf9de37335e242818218b88a91_1474565_480x0_resize_box_3.png 480w, https://crippledmind-infosec-journal.netlify.app/posts/projects/securenet/design_hud254addf9de37335e242818218b88a91_1474565_1024x0_resize_box_3.png 1024w"
loading="lazy"
alt="screenshot (6)"
class="gallery-image"
data-flex-grow="46"
data-flex-basis="110px"
>&lt;/p></description></item></channel></rss>