ISSA Monthly Meeting: 1 August 2023:


WELCOME/AGENDA:

Roops opens the meeting and welcomes attendees to the August 2023 ISSA meeting. Admin: the meeting is held in person and on Zoom. After the meeting, please give feedback on what we can improve! (roopchan@gmail.com)

MEMBERSHIP BENEFITS:

Professional, stay current, hear speakers, professional development, CPE/COU, best practice/practical solutions, career information, employment opportunities. Three people have gotten jobs from our last social! Great to know those connections are being made.
Professional and student rates are available for membership, including a student discount; reach out to Charles (charlesissahr@gmail.com).

NEW MEMBERS:
Welcome to all new members!


EDUCATION:

Provide resources, mentorship program (reach out if interested), team building, collaboration, industry tool familiarization.
Chapter links on website! Resources and reading list

Comprehensive list on the website (https://issa-hr.org/security-resources). Some resources may have costs and are not ISSA related; many members have used these resources to progress their careers. More detailed slides are available if interested.

Entry-Level Certification Opportunity: FREE Certified in Cybersecurity certification offered by ISC2 through its one million program.

VOLUNTEER OPPORTUNITY:

Annual COVA CCI Cybersecurity education and research Conference at ODU.
NEED VOLUNTEERS for the ISSA table Wed Sep 20 and Thurs Sep 21. A tentative social is planned for the same day, September 20. NEED PLANNING LEAD. Great opportunity (do not overcommit yourself, though).

New Social Media Resources (BRITTANY O): New Discord, open to everyone; one area will be members-only and the rest is public. We are finally on LinkedIn!

MEETING NOTES DEMO ON SITE:

Faith W puts them on the site. The notes are very detailed, so if you missed a presentation you can read the recap.


UPCOMING MEETINGS AND SOCIAL EVENTS:

August 1 (this meeting): Charles Herring, WitFoo CTO
September 12th: Johnnie Shubert (Digital Deception: Exposing the dark side of Artificial Intelligence)
October 3rd: Adam Shostack, Shostack & Associates (Threat Modeling in the age of AI)
November 7th: John Bos, Cybrex LLC Founder/CEO: Discussion: Modern Contracting Ownership/Management (DRAFT TOPIC TITLE)
Looking for speakers for 2024, as well as potential backups. The meeting program director and point of contact is Evan L (elarson2003@gmail.com); reach out if interested or to propose a speaker.

CONFERENCES:

Black Hat 2023: Las Vegas NV, August 5-10, Cost $2999-$3999, www.blackhat.com
DEF CON: Las Vegas NV, August 8-11, $295-$1295 www.defcon.org
Infosec World 2023, Lake Buena Vista FL, September 25-27, $299-$3495, www.infosecworldusa.com
SANS Virginia Beach 2023: Hilton VB Oceanfront, Virginia Beach, VA, August 21-Sep 1, $5995-$7495, www.sans.org. Offers training and certifications, which is why it is so $$$; almost like a bootcamp. If interested, reach out to Johnnie or another member; may get a behind-the-scenes vantage, as we have inside contacts.
BSidesNoVA 2023, George Mason University, Arlington VA, September 9-10, fees unavailable at this time (still being worked), www.bsidesva.org. Grassroots sort of conference with workshops, student and professional focused. BSides is on Slack. Fun conference that differentiates itself from others; BSides Vegas happens after DEF CON and Black Hat. Hear talks you may hear nowhere else; BSides events tend to occur on weekends.
COVA CybER Con 2023, Old Dominion University, Webb Center Norfolk VA, September 20-21, $25-$500, https://covacci.org/cybercon/ This is the one we are looking for volunteers for. May be able to cover expenses for volunteers. FREE FOR STUDENTS.
Virginia Beach CyberOps 2023, Old Dominion University, Norfolk VA, October 28th, Free, https://sites.google.com/view/oducyberops2023 One-day event in Norfolk, great networking opportunity in our own back yard; will have a Capture The Flag (CTF) event as well as speakers. Also grassroots, much like CyberForge.

CYBER SOCIAL EVENT:

August 30, 5:30 PM, good for anyone wanting to learn more about tech, held in the Casual Pint side room. It is being held earlier to encourage more turnout. More informal than other meetings (such as this one). This event does not earn CPEs; it is INFORMAL.

Networking happy hour taking place after this meeting at Plaza Degollado


HAVE A JOB/NEED A JOB:

 ISSA has a job search page (http://iz1.me/XJU31zUSeBV)
Government Job Resource (USAJOBS.gov): Federal Resume Guidebook (Author Kathryn Troutman)
Importance of elevator pitch: ~30 seconds that includes a self-introduction, a summary of what you do (your current role and what you are doing well!) or any relevant experience,
an explanation of your value or what problem you can solve, and a call to action (what you’d like to do).
Elevator Pitch Optional details (additional ~30 seconds): clearance details (holding/held), preference for remote or on-site work or willingness to relocate, additional education or certifications (holding or held) not covered in the first 30 seconds, or any other short details.
(We can post your email in the chat during the meeting for those who may be interested.)

Going around the room for those who would like to practice:
NEED A JOB: Chris Parker, Clearance, ITS, approved for skillbridge, 8 years looking to relocate, working also on degree.

HAVE A JOB: Evan L is a Project Manager for Millenium, looking for pentesters. Highly skilled team, and the customer has high expectations; you will learn a lot in your first three months, and after a year or two “could write your own ticket.” Requirements: four-year degree, 1-2 years’ experience, or 8 years’ experience in cyber.
Sec+ (Tier II) minimum, CEH; OSCP and similar certifications preferred to show you have some hands-on experience. TS/SCI preferred, can get away with Secret. Full-time contract. Contact Evan L.

PRESENTATION:

Charles Herring, CTO of WitFoo. Prior Navy, has been in the business quite a while, has spoken with us multiple times, and has flown in from Chicago! (Glad to have him.)


Big Data Security: Cruising on a Data Security Lake, Solving Big Data Challenges in SecOps:

About Charles: US Navy, F/A-18 Legacy Hornet Avionics (1995-2002)
US Navy Cybersecurity (2002-2005)
InfoWorld Test Center (2003-2008)
DoD Security & Data Consultant (2005-2012)
Cisco & Lancope Security Architect (2012-2016)
WitFoo Co-Founder and Project Lead (2016-)

WITFOO RESEARCH:

Aims to connect people who typically don’t work together. Charles is a technical person; executives making business decisions need to know what’s going on, as do others (e.g., auditors, law enforcement). When working together, we can all be more collaborative and efficient.

Brewer’s CAP theorem: The CAP theorem (Brewer’s Theorem), named after Eric Brewer, states that any distributed data store can provide only two of the following THREE guarantees:
Consistency: C Every read receives the most recent write or an error
Availability: A Every request receives a (non-error) response, without the guarantee that it contains the most recent write
Partition Tolerance: P The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
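
Illustrative sketch (not from the presentation): a toy two-replica store showing the trade-off during a network partition. The class name, the "CP"/"AP" mode labels, and the error message are all hypothetical; real databases implement this with far more machinery.

```python
class TwoReplicaStore:
    """Toy store with two replicas: writes arrive at A, reads are served by B."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP" (hypothetical labels)
        self.a = None               # value held by replica A
        self.b = None               # value held by replica B
        self.partitioned = False    # True while A and B cannot talk

    def write(self, value):
        if self.partitioned and self.mode == "CP":
            # CP: refuse the write rather than let the replicas diverge
            raise RuntimeError("unavailable during partition")
        self.a = value
        if not self.partitioned:
            self.b = value          # replication only succeeds when connected

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: return an error instead of a possibly stale value
            raise RuntimeError("unavailable during partition")
        return self.b               # AP: always answers, possibly stale


ap = TwoReplicaStore("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")                      # accepted by A, never reaches B
stale = ap.read()                   # B answers with the older value

cp = TwoReplicaStore("CP")
cp.write("v1")
cp.partitioned = True
try:
    cp.write("v2")
    cp_error = None
except RuntimeError as exc:
    cp_error = str(exc)
```

The AP store keeps answering but serves an older value; the CP store stays consistent by returning an error until the partition heals.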

These guarantees balanced in modern DBs:

Relational (RDBMS):

(Microsoft SQL Server, MySQL, MariaDB, PostgreSQL): Issues balancing these guarantees: Delayed Availability (C), Locking of rows and tables, Active-Passive option (CP), Columns/Fields can be indexed and can establish relationships with one another, Expensive Schema changes, Predictable memory usage, Wide support in programs and languages, Structured Query Language (SQL)
Charles: “Old school, falls apart for big data. Can have active/passive SQL clusters and can have consistency, but will need to wait on it because tables are locking. Can search columns and fields and build indexes; can be fast if you know what you are storing.”
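
Illustrative sketch (not from the presentation): indexed columns and relationships between tables, using Python's built-in sqlite3 module as a stand-in for the servers listed above. The table, column, and index names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        host_id INTEGER NOT NULL REFERENCES hosts(id),  -- relationship
        message TEXT NOT NULL
    );
    CREATE INDEX idx_events_host ON events(host_id);     -- indexed column
""")
conn.execute("INSERT INTO hosts (id, name) VALUES (1, 'web01'), (2, 'db01')")
conn.executemany(
    "INSERT INTO events (host_id, message) VALUES (?, ?)",
    [(1, "login failed"), (1, "login ok"), (2, "backup done")],
)

# Join across the relationship, filtered on the indexed column.
rows = conn.execute(
    "SELECT h.name, e.message FROM events e "
    "JOIN hosts h ON h.id = e.host_id WHERE e.host_id = ? ORDER BY e.id",
    (1,),
).fetchall()

# Ask SQLite how it would run the filter; the detail text is in column 3.
plan = " ".join(
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT message FROM events WHERE host_id = ?", (1,)
    )
)
```

Because the schema is known up front, the planner can use the index instead of scanning the whole table, which is the "fast if you know what you are storing" case.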

NoSQL:

(Cassandra, mongoDB, elasticsearch): Issues balancing these guarantees: Delayed/Eventual Consistency (AP), Faster and scalable, Basis for Graph and Vector Databases
NoSQL stands for “not only SQL” and stores data in various models such as JSON, key-value pairs, wide-column stores, and graph DBs. Typically used for Big Data solutions involving large/complex sets of data not well suited for relational DBs. NoSQL DBs are more scalable, flexible and performant than RDBMS, possibly sacrificing consistency and transactional guarantees (Bing Chat, 2023).
Charles: “Using columns and rows differently: JSON blob, unique ID for the row, partition key for the data. All partitions and data are in the blob. Data preprocessing is built off NoSQL, including Graph and Vector DBs.”
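
Illustrative sketch (not from the presentation): the "JSON blob, unique row ID, partition key" layout as a toy in-memory store. BlobStore and its methods are hypothetical names; a real store such as Cassandra distributes partitions across nodes.

```python
import json
import uuid
from collections import defaultdict


class BlobStore:
    """Toy store: partition key -> {unique row id -> JSON blob}."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def put(self, partition_key, document):
        row_id = str(uuid.uuid4())                      # unique ID for the row
        self.partitions[partition_key][row_id] = json.dumps(document)
        return row_id

    def get(self, partition_key, row_id):
        return json.loads(self.partitions[partition_key][row_id])

    def scan(self, partition_key):
        # Reads touch a single partition, which is what keeps them cheap.
        return [json.loads(b) for b in self.partitions[partition_key].values()]


store = BlobStore()
first = store.put("2023-08-01", {"host": "web01", "msg": "login failed"})
# A later document can carry an extra field with no schema change.
second = store.put("2023-08-01", {"host": "db01", "msg": "backup done", "bytes": 1024})
```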

Graph:

(Built on NoSQL/Tracks relationships between objects): Type of NoSQL database that stores data as nodes and edges, which represent entities and the relationships between them. Graph DBs are used for complex queries that involve traversing multiple connections or paths in the data. Graph DBs can perform analytics that RDBMS cannot do easily or efficiently, such as finding shortest paths, clustering, centrality, and recommendation systems (Bing Chat, 2023).
Charles: “Tracking the relationships of objects, computer A talking to computer B”
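
Illustrative sketch (not from the presentation): "computer A talking to computer B" stored as edges, with a breadth-first search for the shortest path between two machines. The host names and function names are hypothetical.

```python
from collections import deque

# Each pair means "these two computers talked to each other".
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]


def build_graph(pairs):
    """Build an undirected adjacency map from (src, dst) pairs."""
    graph = {}
    for src, dst in pairs:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = build_graph(edges)
```

For example, shortest_path(graph, "E", "C") walks E, A, B, C, the kind of multi-hop question that is awkward to express as relational joins.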


Vector:

(Built on NoSQL/Establishes similarities): Type of NoSQL database that stores data as vectors: arrays of numbers that represent features or attributes of the data. Vector DBs are used for similarity searching and ML applications that require fast and accurate retrieval of similar items based on their vector representations (Bing Chat, 2023).
Charles: “The outcomes are like ‘I want something like something’: things that are alike. Airbnb is a good example of a platform that uses vectors, characteristics about the node objects. ChatGPT is fueled by vector DBs. A type of NoSQL for similarity queries.”
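
Illustrative sketch (not from the presentation): an "I want something like something" query using cosine similarity over small hand-made vectors. The listing names and feature values are invented; a real vector DB would use learned embeddings and an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical features: [near water, quiet, nightlife]
listings = {
    "beach_condo": [1.0, 0.9, 0.1],
    "city_loft": [0.1, 0.2, 1.0],
    "lake_cabin": [0.9, 1.0, 0.0],
}


def most_similar(name, vectors):
    """Return the other item whose vector is closest to the named one."""
    target = vectors[name]
    scored = (
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != name
    )
    return max(scored, key=lambda pair: pair[1])[0]
```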

Data Lake:

(Storage of raw data, no processing, “Data Warehouses” that are searchable): Systems that store large amounts of current and historical data in a variety of formats such as JSON, CSV, Avro, ORC, and Parquet. Data lakes are used for analyzing raw or unstructured data to gain insights.  Data lakes can store any type of data from any source without requiring a predefined schema or transformation (Bing Chat, 2023).
Charles: “if you put it there, it’s there, but it’s not processed or organized, just dumped in the data lake, data is computationally impossible to find. Data warehousing makes sense of Data lakes.”

Predestination of Data:

The entire lifespan of a datum must be established at its birth. Comprehension of syntax, source and intent must be extracted. Inference and potential impact of the datum must be established. Nature of creation and transmission must be preserved. All expected evolutions and iterations of the data need to be established for processing. The death (TTL) of the datum must be established at persistence.
Charles: “When you receive a signal, a piece of info, or a file, you need to know everything about the entire life of that data the moment it is created. How is it going to be processed? Who is the audience? What is the impact of the data? Public health is a good example: a doctor collects information about health with the idea of helping make people healthier, the data becomes metadata, and the data helps make decisions and assign resources for demographics.”
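
Illustrative sketch (not from the presentation): one way to record a datum's source, intent, audience, and TTL at the moment it is created. The Datum class and its field names are hypothetical; timestamps are plain seconds so the example stays deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """A record whose life is decided at birth (hypothetical field names)."""
    payload: dict
    source: str          # where it came from
    intent: str          # what it will be used for
    audience: tuple      # who will consume it
    created_at: float    # seconds since some epoch
    ttl_seconds: float   # the "death" of the datum, fixed at persistence

    def expires_at(self):
        return self.created_at + self.ttl_seconds

    def is_expired(self, now):
        return now >= self.expires_at()


record = Datum(
    payload={"msg": "login failed"},
    source="syslog:web01",
    intent="incident-response",
    audience=("investigator", "auditor"),
    created_at=1_000.0,
    ttl_seconds=3_600.0,
)
```

Making the record frozen mirrors the idea that these decisions are taken once, at creation, rather than patched in later.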

CYBERSECURITY STAKEHOLDERS:

What is the data for? Who is the data for? (Cyber Investigator, Law Enforcement, Small Business, Executive, Insurer, Auditor)
Cyber Investigator: Extreme Message Volumes/ Noise and False positives/ Diverse data sources and formats.
Law Enforcement: How to request evidence/ Submitting legal bulletins/ Explain Cyber to a jury.
Small Business: How to report to police/Collaboration with experts/Reduce cost of cyber.
Executive: Legal compliance/Cyber ROI/ Cyber risk level.
Insurer: Underwrite cyber risk/ Adjust cyber claims/ Coordinate with law enforcement.
Auditor: Legal compliance/ Continuous monitoring/ Objective, Data-driven findings.

Charles: “What I’ve learned from Naval aviation has shaped my knowledge in cyber; data and maintenance records have the same principal applications as with aircraft.
Think about the outcomes of the data. Focus on the incident responder: give them analytics that allow them to do their work (should be a no-brainer). Think about the outcome: if there is a crime, need to report it. Police need to communicate with the prosecutor, the prosecutor deals with the jury, and the jury has to deliberate. Small businesses don’t have the same resources larger organizations have; large enterprises are different from a café down the street (NO SOC, NO IT GUY). Want to benefit those important to us in the community. The executives are concerned about P&L statements; every other business does that (except us). How do we explain what we’re doing to these executives? What are we doing with their money? This is what our data tells them.
Insurance: need to underwrite policies, claims need to be figured out, recover losses, coordinate with law enforcement.
Auditors: Present in Naval aviation too. How do we validate our work, and how do they do continuous monitoring?”


POWER OF JSON:

High compression (net & disk)/ REST powered Transmission/ Easy to Hash and Version/ Hierarchical Structures
Charles: “Great way of storing data. Recalling the birth of XML schema, JSON is a hierarchical structure of data, putting things together and organizing them to give them a use case. Like building a molecule. Structured JSON objects have great compression entropy. The slowest thing in pipelines is I/O, not CPUs; if data can be compressed, you get really good throughput. If a hash doesn’t match, can hash big groups into small hashes.”
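
Illustrative sketch (not from the presentation): compressing structured JSON and hashing it for versioning, using only the Python standard library. The canonical form (sorted keys, compact separators) is an assumption made here so that equal objects always hash the same; the sample events are invented.

```python
import hashlib
import json
import zlib


def canonical_bytes(obj):
    """Serialize with sorted keys so equal objects yield identical bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(obj):
    """SHA-256 of the canonical form; changes whenever the content changes."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


# Repetitive, structured records compress well.
events = [
    {"host": "web01", "user": "alice", "action": "login", "seq": i}
    for i in range(500)
]
raw = canonical_bytes(events)
packed = zlib.compress(raw)
restored = json.loads(zlib.decompress(packed))
```

Comparing len(packed) with len(raw) shows how much less has to cross the network or disk, and comparing digests is a cheap way to tell whether two copies of an object are the same version.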
JSON SHAREABLE OBJECTS: Incident collections, Threat Intel, Bulletins, Job Execution, reports. “Situation reports on how ready an organization is.”
JSON VISUALIZATION - D3.js: MIT License/ JSON data/ Dozens of easy JSON to chart visualizations. “Can make in format for widgets.”
JSON GRAPH VISUALIZATION - Cytoscape.js: MIT License/ JSON Data/ Graph Relationship interaction/ Bioinformatic Research. “JSON is great for shipping; REST APIs are built on JSON, can PUT and update with JSON; the whole internet is the sharing of JSON.”


DATA COMPREHENSION:

Semantic Framing (Grammar); Framing Validation, logical computer formats/ Data Validation; Data context (Encyclopedia), Data Inference (Chatter)/ Low Compute Cost at High Rate/ Natural Language Processing/ Generative Pre-Trained LLMs.
Charles: “When dealing with cybersecurity, we are dealing with law enforcement and national security. It’s important to understand context and words, who is giving this information, and how to validate information. Generative Pre-trained LLMs (GPTs) can be fed gigabytes of data to parse, figure out who may have written it, and write the code to parse it. Dropping data into a data lake without comprehension is like reading a book without engaging: wasting time if not attentive to these details.”


SIGNALS TO GRAPH TO WORK UNITS:

Artifacts tell info about machines:
Charles: “Details show that a node exists (client name, user, file) and tell a relationship between client and user, showing relationships across nodes. Can use a comprehensive message to help infer infected nodes. Goes from signal (artifacts) to a graph.”
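
Illustrative sketch (not from the presentation): turning parsed signals into graph nodes and edges, then using the graph to infer which clients may be affected by a file. The artifact keys ("client", "user", "file"), the relationship labels, and the sample values are hypothetical.

```python
def signal_to_graph(signal):
    """Turn one parsed signal (a dict of artifacts) into nodes and edges."""
    nodes = {(kind, value) for kind, value in signal.items()
             if kind in ("client", "user", "file")}
    edges = set()
    if "client" in signal and "user" in signal:
        edges.add((("client", signal["client"]), "logged_in_as",
                   ("user", signal["user"])))
    if "user" in signal and "file" in signal:
        edges.add((("user", signal["user"]), "opened",
                   ("file", signal["file"])))
    return nodes, edges


def merge(signals):
    """Combine many signals into one graph."""
    all_nodes, all_edges = set(), set()
    for signal in signals:
        nodes, edges = signal_to_graph(signal)
        all_nodes |= nodes
        all_edges |= edges
    return all_nodes, all_edges


def clients_exposed_to(file_name, edges):
    """Clients used by any user who opened the given file."""
    users = {src[1] for src, rel, dst in edges
             if rel == "opened" and dst == ("file", file_name)}
    return sorted(src[1] for src, rel, dst in edges
                  if rel == "logged_in_as" and dst[1] in users)


signals = [
    {"client": "10.0.0.5", "user": "alice", "file": "invoice.exe"},
    {"client": "10.0.0.9", "user": "alice"},
]
nodes, edges = merge(signals)
exposed = clients_exposed_to("invoice.exe", edges)
```

The second client never touched the file directly; it shows up only because the graph links it to the same user.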

GRAPH VS CRIME THEORY:

Meaningful Graph Relationships/ Modus Operandi (MO) of Attacker/ Combines, standardizes diverse data/ Hierarchical JSON/ SECOPS & LE Unit of Work
Charles: “Many things can be graphed, data backup or power supply, can build inferences from graphs. JSON object is collection of nodes and characteristics.”

GRAPH ATTRIBUTION TO CAMPAIGNS: “Can build associations”
VECTOR COMPARISON FOR ANALYSIS: “When you look at vector comparison, work with AI to look at relationships and infer common similarities (‘looks like’). Doesn’t have to be within our work; can be shared to the broader community. Seen on Navy ships, in the private sector, and in China. Works for elevating or assisting the incident responder at the least.”

BIG DATA CYBERSECURITY PIPELINE:

Charles: “Data in the data lake is pushed into a pipeline (Pulsar, NAX, compta): a producer produces messages on the pipe and consumers read some of these messages out, creating an understood signal and recognizing contents with files. These artifacts are used to build out a graph: here’s what the signals are telling us and how they’re interacting with each other. Graph units are analyzed to identify problems in the graph; a collection of relationships can indicate them. Vectors are built off the unit of work and extract its characteristics, so when a query asks what other things are like it, the vector can answer. Use cases from data originating from the same place: forensics, reverse engineering, looking at metadata and logs, interpreting the graph. Audit reports use graph signals that corroborate node relationships. Can look for machines that don’t have certain system settings, processes, software, or other implementations. Business reporting: on the commercial side, how much money is spent, how much money is given, how much money is saved. This tool reduces labor costs of calculation and accounting, and overall human labor and error; can reduce risk for executive business decision makers. Can help to identify compliance gaps.
These pipelines are increasingly being thought out. Can do things that help out more broadly. Can pull resources from reports and the data gets better; use cases inform what data should be going in. A lot of current tools look at events per second as a triage, which is very dangerous for cybersecurity.”
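
Illustrative sketch (not from the presentation): the producer, consumer, graph, and unit-of-work stages in miniature. Python's queue.Queue stands in for a real message broker, and the "src"/"dst" fields and function names are hypothetical. Here a unit of work is simply a group of connected nodes.

```python
import json
import queue


def produce(raw_lines, pipe):
    """Producer: push raw messages onto the pipe, then a None sentinel."""
    for line in raw_lines:
        pipe.put(line)
    pipe.put(None)


def consume(pipe):
    """Consumer: read messages off the pipe and parse them into signals."""
    while True:
        line = pipe.get()
        if line is None:
            break
        yield json.loads(line)


def to_edges(signals):
    """Graph stage: each signal says one node interacted with another."""
    for signal in signals:
        yield (signal["src"], signal["dst"])


def units_of_work(edges):
    """Group edges into connected sets of nodes (one set per unit of work)."""
    groups = []
    for src, dst in edges:
        touching = [g for g in groups if src in g or dst in g]
        merged = {src, dst}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups


raw = [
    '{"src": "A", "dst": "B"}',
    '{"src": "C", "dst": "D"}',
    '{"src": "B", "dst": "C"}',
    '{"src": "X", "dst": "Y"}',
]
pipe = queue.Queue()
produce(raw, pipe)
units = units_of_work(to_edges(consume(pipe)))
```

Four separate messages collapse into two units of work, which is the opposite of triaging on raw events per second.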

SUMMARY:

 Start Data strategies with user needs/ “Predestined” Data can live a meaningful life/Stay mindful of hardware costs of decisions/ JSON versions are flexible, powerful and portable.
Charles: “Before you start collecting data, identify an endgame: what is being done and who will see this data? How does it inform auditors? How can we build these pipelines? The idea of data predestination is very liberating: modify the pipeline to find out what needs to be added for the graph. All of this can pertain to a NoSQL DB; with JSON it is easy to add fields and not break anything. Traditionally this could take a while and be error prone.”

RESOURCES: “Powered by Witfoo”
Free Training on WitFoo Community
Educational platform, “no pricing, just spin up labs” US law enforcement free licensing
Raspberry Pi 4 demo licensing for training
www.witfoo.com or charles@witfoo.com

QUESTIONS:

AI advice?: H2OGPT use case: put documents in user data and it loads them into a vector DB (prefers VB8, an open source DB). Can tell it to find data on xyz; it knows what’s in your data and can pull what it thinks you’re looking for. Great way to turn a data lake into something quasi-useful. It is still an AI built off models; works with GPT4All (he can’t get it to work, but it is a way). DATA LAKE TO DATA WAREHOUSE = LET THE AI SOLVE IT.
Essentially a file share, all built around different permissions schemes (DAC, RBAC, very granular on the data lake side)
NoSQL is same type of security that would be on SQL DB, can be done at operations level, application level or DB level.
Cassandra (NoSQL) encryption supported in process and at rest. If building a data lake, do security before you start putting data in it otherwise you will have to wipe it in order to implement.

Want to make sure a human (non-AI) is greenlighting what the AI is doing: MUST HAVE HUMAN-LEVEL VALIDATION, since it can impact decisions made by humans.

Which AI are you using? What obstacles are you seeing?: Translation of SYSLOG messages, running on a 12B parameter model. Some models are licensed for non-commercial use only and can’t be used; Meta’s can only be used for educational purposes. Would love to create his own model, but it would cost $$$. Running the 12B parameter model (without anything else) takes about 24 gigs of GPU memory; the more gigs, the more expensive. Training turned out to be easier than he thought: focused on what is being fed and built scrapers to determine candidates. There are not a lot of good publicly available training materials, and finding good documentation is a challenge. The LLM leaderboard on the Hugging Face website (a GitHub for AI models) lets you download a model and just load it into the AI. H2OGPT is Apache 2 licensed; setup was black magic to get everything to compile, had to create some makes and build some drivers. Not very mature on the open-source side.
All in Python. Trying to keep libraries in Go is a pain. The current state of AI maturity and cost have been the biggest challenges. Want a clean model, but models are built off unclean data; need enough data to balance the bias. Data is also an issue. Can’t train OpenGPT or ChatGPT, though that is opening up on the Azure side.
AI is life-changing: can write code like scrapers and study translation into other languages; things that used to take several different queries have been streamlined using these models. Very useful and “TRYING TO MAKE ME LAZIER.”

Recommendations to get smart with AI? Google’s got a free course; have you heard anything about recommended courses for folks to get started?: It’s like having a conversation: you’ll learn while using it. Prompting is very effective, so the best advice is to start using it. All kinds of stuff to try. Prompt engineering is a good way to start; the best way to get a good answer is to ask the right question. Tell it something so it knows the context. Start with basic informing questions: the LLM knows its training data, while “my data” or “session data” tells it the context. Can ask it to summarize a repo, contextualize, or show you how it would write something to correct it. Through prompting, give it context and a starting point, and talk through it as if you were talking to an expert. Sometimes it’s not perfect, but that’s the magic.
Not all AI sites will keep the context, so build off previous questions. There are a lot of ways other than asking it a question to integrate it into other things.

Courses available to integrate it into business?

Google engineer python code generator project: ask it to “write me a program that will integrate with Excel,” etc., and it will write out the code for that. Like building a library, not just one file. Bing or ChatGPT can be limited in what they output.
Modeling and inferences will continue to evolve. A lot of libraries for working with AI are in Python. For connecting to APIs and web sockets that have no documentation, he needed to write sets of code to determine the type of API and how to connect; instruction and summarization help: give it context beforehand, get output text, and correct accordingly.


BUSINESS MEETING:

Old: all board positions filled. Lots of opportunities for chairs, leadership, and creating new opportunities if you have a passion and something you can contribute.
New: Social coming up, new Discord, new LinkedIn, new website updates with meeting recap. Charles will put out an email to all members to join Discord.
Networking Session/Happy Hour: mid-month, Wednesday, August 30th @ 5:30 PM at the Casual Pint, 338- Princess Anne Rd, Suite 111, Virginia Beach, VA 23456
Volunteer Events: (SEE VOLUNTEER OPPORTUNITY at start of meeting). Meeting minutes are very detailed and posted on the website.

MEMBERSHIP UPDATES:

Can reach out to Charles (charlesissahr@gmail.com) to help out with membership, ISSA.org.

AUGUST 2023 MEETING-TREASURER REPORT:

No transactions for this month, $5,758 ending balance; pizza was sponsored last month. Did reach out to the Tidewater science fair, which we normally collab with; we do prizes for the students, a rewarding experience for the club. Need income or people to sponsor on our behalf. Want to do more in the community.
ISSA SOCIAL MEDIA: Growth has slowed down. New LinkedIn PAGE: members are encouraged to follow the new page as well as the group (two LinkedIn presences). Discord is also new and currently has 12 members (see slide); hoping to push Discord. Conference attendance should help.

MEETING ADJOURNED.