Kestrel Threat Hunting Language

What is Kestrel? Why do we need it? How to hunt with XDR support? What is the science behind it?

You can find all the answers at the Kestrel documentation hub. A quick primer is below.

Overview

The Kestrel threat hunting language provides an abstraction that lets threat hunters focus on what to hunt instead of how to hunt. The abstraction makes it possible to codify reusable hunting knowledge in a composable and sharable manner. The Kestrel runtime figures out how to hunt, making cyber threat hunting less tedious and more efficient.

Kestrel overview.

  • Kestrel language: a threat hunting language for a human to express what to hunt.
    • expressing the knowledge of what in patterns, analytics, and hunt flows.
    • composing reusable hunting flows from individual hunting steps.
    • reasoning with human-friendly entity-based data representation abstraction.
    • thinking across heterogeneous data and threat intelligence sources.
    • applying existing public and proprietary detection logic as analytics.
    • reusing and sharing individual hunting steps and entire hunt books.
  • Kestrel runtime: a machine interpreter that deals with how to hunt.
    • compiling the what against specific hunting platform instructions.
    • executing the compiled code locally and remotely.
    • assembling raw logs and records into entities for entity-based reasoning.
    • caching intermediate data and related records for fast response.
    • prefetching related logs and records for link construction between entities.
    • defining extensible interfaces for data sources and analytics execution.

Installation

Kestrel requires Python 3 to run (the v1.4.0 release raised the minimum version to 3.7). Check the Python installation guide if you do not have Python. We recommend installing the Kestrel runtime with pip, preferably inside a Python virtual environment.

  1. Update Python installer.
$ pip install --upgrade pip setuptools wheel
  2. Install Kestrel runtime.
$ pip install kestrel-lang
  3. Install Kestrel Jupyter kernel if you use Jupyter Notebook to hunt.
$ pip install kestrel-jupyter
$ python -m kestrel_jupyter_kernel.setup
  4. (Optional) Download Kestrel analytics examples for the APPLY hunt steps.
$ git clone https://github.com/IBM/kestrel-analytics.git

Hello World Hunt

  1. Copy the following 3-step hunt flow into your favorite text editor:
# create four process entities in Kestrel and store them in the variable `proclist`
proclist = NEW process [ {"name": "cmd.exe", "pid": "123"}
                       , {"name": "explorer.exe", "pid": "99"}
                       , {"name": "firefox.exe", "pid": "201"}
                       , {"name": "chrome.exe", "pid": "205"}
                       ]

# match a pattern of browser processes, and put the matched entities in variable `browsers`
browsers = GET process FROM proclist WHERE [process:name IN ('firefox.exe', 'chrome.exe')]

# display the information (attributes name, pid) of the entities in variable `browsers`
DISP browsers ATTR name, pid
  2. Save to a file helloworld.hf.
  3. Execute the hunt flow in a terminal (in Python venv if virtual environment is used):
$ kestrel helloworld.hf

You have now captured the browser processes, out of all the processes created, in the Kestrel variable browsers:

       name pid
 chrome.exe 205
firefox.exe 201

[SUMMARY] block executed in 1 seconds
VARIABLE    TYPE  #(ENTITIES)  #(RECORDS)  process*
proclist process            4           4         0
browsers process            2           2         0
*Number of related records cached.
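
For readers who think in Python, the three hunt steps above can be mirrored with plain data structures. This is only an analogy for what NEW, GET ... WHERE, and DISP express; the helper name `get_where_name_in` is hypothetical, and the sketch does not reflect how the Kestrel runtime stores or matches entities.

```python
# Step 1 (NEW): four process entities held in a plain list of dicts.
proclist = [
    {"name": "cmd.exe", "pid": "123"},
    {"name": "explorer.exe", "pid": "99"},
    {"name": "firefox.exe", "pid": "201"},
    {"name": "chrome.exe", "pid": "205"},
]


def get_where_name_in(entities, names):
    """Hypothetical analogue of: GET process FROM <var> WHERE name IN (...)."""
    return [entity for entity in entities if entity["name"] in names]


# Step 2 (GET): keep only the browser processes.
browsers = get_where_name_in(proclist, {"firefox.exe", "chrome.exe"})

# Step 3 (DISP ... ATTR name, pid): show the two selected attributes.
for entity in browsers:
    print(entity["name"], entity["pid"])
```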

Hunting In The Real World

  1. How to develop hunts interactively in Jupyter Notebook?
  2. How to connect to one or more real-world data sources?
  3. How to write and match a TTP pattern?
  4. How to find child processes of a process?
  5. How to find network traffic from a process?
  6. How to apply pre-built analytics?
  7. How to fork and merge hunt flows?

Find more at the Kestrel documentation hub.

Connecting With The Community

Quick questions? Like to meet other users? Want to contribute? Join our Kestrel Slack workspace.

Comments
  • Syntax simplification

    Is your feature request related to a problem? Please describe. Discussion and planning for syntax revision. Some ideas under discussion:

    1. Redundant entity type in GET
    x = GET process FROM datasource WHERE [process:pid = 123]

    Simplified:

    x = GET process FROM datasource WHERE pid = 123

    2. Redundant entity type in GET from variable
    w = GET process FROM z WHERE [process:pid = 123]

    Simplified:

    w = z WHERE pid = 123

    3. Expression
    <var> [FILTER] [ AGG [ FILTER]] [SORT] [OFFSET] [LIMIT]

    This expression may be used in DISP and COPY.
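
As a rough illustration of the first two ideas, the sketch below expands a simplified comparison such as `pid = 123` back into the bracketed, type-prefixed form. The function name `expand_where` and its regular expression are hypothetical and handle only a single comparison; this is not how the Kestrel parser is implemented.

```python
import re

# One attribute path followed by the rest of the comparison, e.g. "pid = 123".
_ATTRIBUTE_THEN_REST = re.compile(r"^([\w.'\-]+)\s+(.+)$")


def expand_where(entity_type: str, where: str) -> str:
    """Prefix a simplified comparison with the entity type (hypothetical helper)."""
    stripped = where.strip()
    # A pattern that is already bracketed is passed through unchanged.
    if stripped.startswith("[") and stripped.endswith("]"):
        return stripped
    match = _ATTRIBUTE_THEN_REST.match(stripped)
    if match is None:
        raise ValueError(f"cannot parse comparison: {where!r}")
    attribute, rest = match.groups()
    return f"[{entity_type}:{attribute} {rest}]"


print(expand_where("process", "pid = 123"))
```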

    enhancement 
    opened by subbyte 4
  • Unable to query data from elasticsearch

    Describe the bug Hi, I am trying to follow the tutorial from the documentation hub using an ELK stack. However, I am getting a KestrelSyntaxError when querying. I tried it with Python 3.6 and 3.9; both produce the same error.

    Details of the bug

    • What is the hunt flow/script you are executing? Hunt flow from the tutorial.
    • What is the command that failed?
    var = GET process FROM stixshifter://host101
    
    • What is the error message?
    [ERROR] KestrelSyntaxError: invalid token "" at line 1 column 24. rewrite the failed statement.
    

    To Reproduce Steps to reproduce the behavior:

    1. Set up Sysmon & Elasticsearch
    2. Create API key on Elasticsearch for access
    3. Test Elasticsearch access using API key
    4. Configure environment variables
    $ export STIXSHIFTER_HOST101_CONNECTOR=elastic_ecs
    $ export STIXSHIFTER_HOST101_CONNECTION='{"host":"REDACTED.elastic-cloud.com", "port":9243, "indices":"winlogbeat-7.14.0-2021.08.04-000001"}'
    $ export STIXSHIFTER_HOST101_CONFIG='{"auth":{"id":"REDACTED", "api_key":"REDACTED"}}'
    
    5. Test using stix-shifter:
    $ stix-shifter transmit elastic_ecs '{"host":"REDACTED.elastic-cloud.com", "port":9243, "indices":"winlogbeat-7.14.0-2021.08.04-000001"}' '{"auth":{"id":"REDACTED", "api_key":"REDACTED"}}' ping
    
    {
        "success": true,
        "data": "{\n  \"cluster_name\" : \"66a63ad60eae4e2b9fb38f524b8defcc\",\n  \"status\" : \"green\",\n  \"timed_out\" : false,\n  \"number_of_nodes\" : 3,\n  \"number_of_data_nodes\" : 2,\n  \"active_primary_shards\" : 86,\n  \"active_shards\" : 172,\n  \"relocating_shards\" : 0,\n  \"initializing_shards\" : 0,\n  \"unassigned_shards\" : 0,\n  \"delayed_unassigned_shards\" : 0,\n  \"number_of_pending_tasks\" : 0,\n  \"number_of_in_flight_fetch\" : 0,\n  \"task_max_waiting_in_queue_millis\" : 0,\n  \"active_shards_percent_as_number\" : 100.0\n}\n"
    }
    
    6. Run Jupyter Notebook with the command
    var = GET process FROM stixshifter://host101
    [ERROR] KestrelSyntaxError: invalid token "" at line 1 column 24. rewrite the failed statement.
    

    Expected behavior Results from query

    Environment (please complete the following information):

    • OS: Ubuntu 20.04
    • Python version: Python 3.9.5, Python 3.6.9
    • Python install environment: Python virtual environment
    • STIX-Shifter version: 3.5.0
    bug 
    opened by kinzhong 4
  • ValueError: Unrecognised argument(s): force

    Describe the bug ValueError when running Hello World Hunt using Python 3.6.9. Installed using pip install kestrel-lang.

    Details of the bug

    • What is the hunt flow/script you are executing? Hello World Hunt from readme
    • What is the command that failed?
    $ kestrel helloworld.hf
    
    • What is the error message?
    $ kestrel helloworld.hf
    Traceback (most recent call last):
      File "/usr/local/bin/kestrel", line 8, in <module>
        runpy.run_module('kestrel', run_name='__main__')
      File "/usr/lib/python3.6/runpy.py", line 208, in run_module
        return _run_code(code, {}, init_globals, run_name, mod_spec)
      File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/usr/local/lib/python3.6/dist-packages/kestrel/__main__.py", line 49, in <module>
        logging_setup(None, args.verbose, args.debug)
      File "/usr/local/lib/python3.6/dist-packages/kestrel/__main__.py", line 33, in logging_setup
        force=True,
      File "/usr/lib/python3.6/logging/__init__.py", line 1829, in basicConfig
        raise ValueError('Unrecognised argument(s): %s' % keys)
    ValueError: Unrecognised argument(s): force
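
The traceback shows `force=True` being passed to `logging.basicConfig`. That keyword was added in Python 3.8, which is why Python 3.6 rejects it. A minimal sketch of a version-tolerant setup follows; the function name `logging_setup_compat` is hypothetical, and this is not presented as the fix that Kestrel shipped.

```python
import logging
import sys


def logging_setup_compat(level=logging.INFO):
    """Reset root logging on any Python 3 version (hypothetical helper)."""
    root = logging.getLogger()
    if sys.version_info >= (3, 8):
        # Python 3.8+ can drop existing handlers via force=True.
        logging.basicConfig(level=level, force=True)
    else:
        # Older interpreters: remove handlers by hand, then configure.
        for handler in list(root.handlers):
            root.removeHandler(handler)
            handler.close()
        logging.basicConfig(level=level)
    return root


root_logger = logging_setup_compat(logging.DEBUG)
root_logger.debug("logging configured")
```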
    

    To Reproduce

    1. pip install --upgrade pip setuptools wheel
    2. pip install kestrel-lang
    3. kestrel helloworld.hf

    Expected behavior Output from hunt flow.

    Environment (please complete the following information):

    • OS: Ubuntu 18.04
    • Python version: Python 3.6.9
    • Python install environment: Python virtual environment
    • STIX-Shifter version: 3.5.0
    bug 
    opened by kinzhong 4
  • In-STIX pattern variable auto complete does not work

    Describe the bug If one tries to auto-complete a variable name in STIX pattern for a parameterized pattern, it does not work.

    Details of the bug This is a limitation of the STIX pattern parser we currently use. We need to upgrade the parser.

    bug 
    opened by subbyte 2
  • File paths can't have spaces

    Describe the bug A parsing error is thrown when a file path has a space in it

    Details of the bug GET process FROM file:///a/path/with/a space/in_the_name/bundle.json

    Results in:

    lark.exceptions.UnexpectedCharacters: No terminal matches 's' in the current parser context, .....
    /a/path/with/a space/in_the_name/bundle.json`
                            ^
    Expected on of:
                    * WHERE
    

    To Reproduce Try to run GET on a file:// bundle with a space in the name

    Environment (please complete the following information):

    • OS: macOS 11.6
    • Python version: 3.7.7
    • Python install environment:
    • STIX-Shifter version: latest github develop branch.

    bug 
    opened by imolloy 2
  • Error reporting from analytics

    Some analytics (regardless of which interface they use) may call third-party APIs, particularly those doing threat intel enrichment. Sometimes those APIs fail because of authentication issues, temporary network problems, and so on. There is currently no way for the user to be notified of such problems.

    There should be some way for analytics to capture such error information and report it back up. The implementation may differ per analytics interface (e.g. a native Python interface, where the analytics run under the same Python interpreter as the core, can probably just raise an exception, while the docker interface may need to write the information to a file).
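
The two reporting paths described above could look roughly like the following sketch. Every name here (`AnalyticsError`, `enrich`, `run_in_container`, the JSON error file layout) is hypothetical and chosen for illustration; Kestrel's analytics interfaces may settle on something different.

```python
import json
from pathlib import Path


class AnalyticsError(Exception):
    """Hypothetical exception an analytics raises when enrichment fails."""


def enrich(indicator, lookup):
    """Native-Python style: wrap the third-party failure and raise it."""
    try:
        return lookup(indicator)
    except Exception as exc:
        raise AnalyticsError(f"enrichment failed for {indicator!r}: {exc}") from exc


def run_in_container(indicator, lookup, error_file: Path):
    """Container style: no shared interpreter, so persist the error to a file."""
    try:
        return enrich(indicator, lookup)
    except AnalyticsError as exc:
        error_file.write_text(json.dumps({"error": str(exc)}))
        return None
```

In this sketch the caller of the container variant would check for the error file after the run and surface its content to the hunter.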

    enhancement 
    opened by pcoccoli 2
  • sqlite3.OperationalError: near "ON": syntax error

    When passing helloworld.hf as an argument to kestrel on the command line, the following error message appears:

    [docker@docker ~]$ kestrel helloworld.hf --debug
    16:19:00 DEBUG kestrel.session Establish session with session_id: None, runtime_dir: None, store_path:None, debug_mode:True
    16:19:00 DEBUG kestrel.session Configuration file /kestrel/kestrel.toml does not exist.
    16:19:00 DEBUG kestrel.session Configuration file etc/kestrel/kestrel.toml does not exist.
    16:19:00 DEBUG kestrel.session Configuration file /home/docker/.local/etc/kestrel/kestrel.toml loaded successfully.
    16:19:00 DEBUG kestrel.session Configuration file /home/docker/.config/kestrel/kestrel.toml does not exist.
    16:19:00 DEBUG kestrel.session Configuration loaded: {'session': {'local_database_path': 'local.db', 'debug_env_var_name': 'KESTREL_DEBUG'}, 'language': {'default_variable': '_', 'default_sort_order': 'desc'}, 'stixquery': {'timerange_start_offset': -300, 'timerange_stop_offset': 300, 'support_id': False}, 'prefetch': {'get': True, 'find': True, 'process_name_change_timerange_start_offset': -5, 'process_name_change_timerange_stop_offset': 5, 'process_lifespan_start_offset': -10800, 'process_lifespan_stop_offset': 10800}}
    16:19:00 DEBUG kestrel.session create new session runtime_directory: /tmp/kestrel-session-212ddaa5-c492-41c7-8c1c-0639a1eb82cd.
    16:19:00 DEBUG firepit.sqlitestorage Connection to SQLite DB /tmp/kestrel-session-212ddaa5-c492-41c7-8c1c-0639a1eb82cd/local.db successful
    16:19:00 DEBUG firepit.sqlitestorage Executing query: CREATE TABLE IF NOT EXISTS "__symtable" (name TEXT, type TEXT, appdata TEXT);
    16:19:00 DEBUG firepit.sqlitestorage Executing query: CREATE TABLE IF NOT EXISTS "__membership" (sco_id TEXT, var TEXT);
    16:19:00 DEBUG firepit.sqlitestorage Executing query: CREATE TABLE IF NOT EXISTS "__queries" (sco_id TEXT, query_id TEXT);
    16:19:01 DEBUG kestrel.codegen.commands Executing 'new' with statement: {'command': 'new', 'type': 'process', 'data': '[ {"name": "cmd.exe", "pid": "123"}\n , {"name": "explorer.exe", "pid": "99"}\n , {"name": "firefox.exe", "pid": "201"}\n , {"name": "chrome.exe", "pid": "205"}\n ]', 'output': 'proclist'}
    16:19:01 DEBUG firepit.splitter _create_table: "CREATE TABLE "process" ("name" TEXT,"pid" TEXT,"type" TEXT,"id" TEXT UNIQUE);"
    16:19:01 DEBUG firepit.sqlitestorage Executing query: CREATE TABLE "process" ("name" TEXT,"pid" TEXT,"type" TEXT,"id" TEXT UNIQUE);
    16:19:01 DEBUG firepit.sqlitestorage Executing query: CREATE INDEX "process_id" ON "process" ("id");
    16:19:01 DEBUG firepit.sqlstorage _upsert: "INSERT INTO "process" ("name", "pid", "type", "id") VALUES (?, ?, ?, ?)
    ON CONFLICT (id) DO UPDATE SET "name" = EXCLUDED."name", "pid" = EXCLUDED."pid", "type" = EXCLUDED."type";"
    Traceback (most recent call last):
      File "/home/docker/.local/bin/kestrel", line 8, in <module>
        runpy.run_module('kestrel', run_name='__main__')
      File "/usr/lib64/python3.6/runpy.py", line 208, in run_module
        return _run_code(code, {}, init_globals, run_name, mod_spec)
      File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/__main__.py", line 49, in <module>
        outputs = session.execute(huntflow)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/session.py", line 262, in execute
        return self._execute_ast(ast)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/session.py", line 437, in _execute_ast
        output_var_struct, display = execute_cmd(stmt, self)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/codegen/commands.py", line 92, in wrapper
        return func(stmt, session)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/codegen/commands.py", line 60, in wrapper
        ret = func(stmt, session)
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/codegen/commands.py", line 123, in new
        stmt["type"] = load_data(session.store, stmt["output"], stmt["data"], stmt["type"])
      File "/home/docker/.local/lib/python3.6/site-packages/kestrel/codegen/data.py", line 30, in load_data
        store.load(output_entity_table, data, entity_type, query_id)
      File "/home/docker/.local/lib/python3.6/site-packages/firepit/sqlstorage.py", line 294, in load
        splitter.close()
      File "/home/docker/.local/lib/python3.6/site-packages/firepit/splitter.py", line 228, in close
        self.writer.write_records(obj_type, recs, self.schemas[obj_type], self.replace, self.query_id)
      File "/home/docker/.local/lib/python3.6/site-packages/firepit/splitter.py", line 153, in write_records
        self.store.upsert(cursor, tablename, obj, query_id)
      File "/home/docker/.local/lib/python3.6/site-packages/firepit/sqlstorage.py", line 224, in upsert
        cursor.execute(stmt, values)
    sqlite3.OperationalError: near "ON": syntax error
    16:19:01 DEBUG firepit.sqlitestorage Closing SQLite DB connection
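
One plausible explanation, not confirmed in this report, is that the SQLite library linked into this Python 3.6 installation predates SQLite 3.24.0, the release that introduced the `ON CONFLICT ... DO UPDATE` (UPSERT) clause seen in the failing statement. The sketch below checks the linked SQLite version and falls back to an update-then-insert sequence; the helper names and the two-column table are hypothetical and are not taken from firepit.

```python
import sqlite3

# UPSERT (INSERT ... ON CONFLICT ... DO UPDATE) first appeared in SQLite 3.24.0.
UPSERT_MIN_VERSION = (3, 24, 0)


def supports_upsert() -> bool:
    """Report whether the SQLite library linked into Python understands UPSERT."""
    return sqlite3.sqlite_version_info >= UPSERT_MIN_VERSION


def upsert_process(conn, proc_id, name):
    """Insert or update one row, with a fallback for old SQLite libraries."""
    if supports_upsert():
        conn.execute(
            "INSERT INTO process (id, name) VALUES (?, ?) "
            "ON CONFLICT (id) DO UPDATE SET name = excluded.name",
            (proc_id, name),
        )
    else:
        # Fallback: try an UPDATE first, INSERT only when no row matched.
        cursor = conn.execute(
            "UPDATE process SET name = ? WHERE id = ?", (name, proc_id)
        )
        if cursor.rowcount == 0:
            conn.execute(
                "INSERT INTO process (id, name) VALUES (?, ?)", (proc_id, name)
            )


print("SQLite", sqlite3.sqlite_version, "supports UPSERT:", supports_upsert())
```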

    bug question 
    opened by RukhsarRiazKhan 2
  • implement entity id attr pick-up mech and fix #31

    The PR fully addresses #31 and partially addresses #32.

    1. implement a new function get_entity_id_attribute() in src/kestrel/codegen/relations.py to compute the appropriate attribute used as identifier attribute for entities.
    2. update src/kestrel/codegen/commands.py to use get_entity_id_attribute().
    3. replace the or_pattern() for post-prefetch merge in src/kestrel/codegen/commands.py with firepit.merge() (partially address #32).
    4. update get_variable_entity_count() in src/kestrel/codegen/summary.py to use get_entity_id_attribute().
    5. update _get_variable_query_ids() in src/kestrel/codegen/summary.py since merged variables in firepit do not have __membership records.
    6. update gen_variable_summary() in src/kestrel/codegen/summary.py to only give cached records when there is a data source query.
    opened by subbyte 2
  • github pypi CI/CD workflow

    define GitHub Action for automatic package release to pypi https://packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/

    enhancement 
    opened by subbyte 2
  • fix version issue for automatic connector install

    This is a patch on the Kestrel side. An upstream patch in stix-shifter would be a better way to prevent the issue: https://github.com/opencybersecurityalliance/stix-shifter/issues/1087

    opened by subbyte 1
  • Don't require schemes for `FROM` and `APPLY`

    Is your feature request related to a problem? Please describe. The hunter doesn't care if an analytic is using docker or python.

    Describe the solution you'd like APPLY my_analytic should work whether it's a python- or docker-based analytic. Similarly FROM my_datasource should work without having to specify stixshifter://my_datasource.

    Maybe we have something in config that lists interface preference order? E.g. python,docker so it checks python first, then docker. Stop at first match.

    You could still supply the scheme, to force the one you want in case of name collisions.
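
The preference-order lookup proposed above can be sketched as follows. The function `resolve_scheme`, the `registry` mapping, and the example analytic names are all hypothetical; the sketch only shows the "first match wins, explicit scheme overrides" behavior described in the request.

```python
from typing import Dict, List, Optional, Set


def resolve_scheme(
    name: str, registry: Dict[str, Set[str]], preference: List[str]
) -> Optional[str]:
    """Return 'scheme://name' for the first interface that knows `name`."""
    # An explicit scheme always wins, which resolves name collisions.
    if "://" in name:
        return name
    for scheme in preference:
        if name in registry.get(scheme, set()):
            return f"{scheme}://{name}"
    return None


# Hypothetical registry: which names each interface can serve.
registry = {"python": {"enrich"}, "docker": {"enrich", "plot"}}
print(resolve_scheme("enrich", registry, ["python", "docker"]))
```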

    Describe alternatives you've considered Leave it like it is now.

    Additional context N/A

    enhancement 
    opened by pcoccoli 0
  • relax single quote requirement for attribute with dash

    Is your feature request related to a problem? Please describe. In a STIX pattern, a property or partial property that has a dash - in it needs to be wrapped in single quotes, such as [file:hashes.'SHA-256' = 'xxxxxxxxx...']. This means that in Kestrel, one needs to write GET file WHERE hashes.'SHA-256' = 'xxxxxx...'. Most users may not expect this rule. We are thinking of relaxing it so users can write GET file WHERE hashes.SHA-256 = 'xxxxxx...' and Kestrel will assemble the STIX pattern with single quotes if needed.

    Note that Kestrel is STIX compatible, so if we implement this, it will still allow users to have single quotes like hashes.'SHA-256', in which case Kestrel will not modify the string when assembling the STIX pattern.

    Describe the solution you'd like firepit also needs the single quotes, so the parser (transformer) could add single quotes around any attribute component that contains a dash and is not already quoted.

    Describe alternatives you've considered Do the modification in to_stix() and to_firepit() in ECGP.

    Additional context An additional consideration is whether this difference from STIX creates extra confusion for users who are familiar with STIX. However, since the planned solution supports both forms (it only relaxes the strict single quote requirement), this could be fine.
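
A minimal sketch of the quoting step discussed in this request is shown below. The function name `quote_dashed_components` is hypothetical, and the sketch assumes attribute components are separated by dots and that quoted components contain no dots themselves; it is not the transformer Kestrel implements.

```python
def quote_dashed_components(attribute: str) -> str:
    """Wrap dot-separated components containing '-' in single quotes."""
    parts = []
    for part in attribute.split("."):
        already_quoted = len(part) >= 2 and part[0] == "'" and part[-1] == "'"
        # Leave user-supplied quotes untouched to stay STIX compatible.
        if "-" in part and not already_quoted:
            part = f"'{part}'"
        parts.append(part)
    return ".".join(parts)


print(quote_dashed_components("hashes.SHA-256"))
```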

    enhancement 
    opened by subbyte 0
  • autocomplete function doesn't behave correctly

    Description: The autocompletion function doesn't handle partially completed fields correctly. Fields that start with the same characters as commands also get commands matching last_word as suggestions, which they should not. No error messages are output, but the behavior does not match what is expected. Examples are included below for clarity. One possible solution is to completely revamp the logic of the do_complete() function, looking at the parsing portion of the existing code in particular.

    Environment: I verified that the variable autocompletion error appears in the Tutorial Huntbook environment, so I believe any Kestrel runtime environment through Jupyter Notebook should display this issue. For attribute autocompletion, I was running my Kestrel environment from Jupyter Notebook locally hosted through a Python 3 virtual environment on Windows 11 WSL Ubuntu 20.04.5 LTS. (I have no idea if the grammar there was correct, sorry...)

    Details & How to Reproduce: Huntflow taken directly from the Kestrel tutorial (0. Hello World Hunt). In the code block below, <tab> represents hitting the tab button, which calls the autocomplete function (do_complete()) linked above.

    proclist = NEW process [ {"name": "cmd.exe", "pid": "123"}
                           , {"name": "explorer.exe", "pid": "99"}
                           , {"name": "firefox.exe", "pid": "201"}
                           , {"name": "chrome.exe", "pid": "205"}
                           ]
    browsers = GET process FROM proclist WHERE name IN ('firefox.exe', 'chrome.exe')
    
    # --  scenario 1
    DISP <tab>                       # case 1
    DISP b<tab>                      # case 2
    DISP browsers<tab>               # case 3
    DISP browsers <tab>              # case 4
    
    # -- add and run this as a new block before calling DISP
    abc = browsers
    
    # --  scenario 2
    DISP a<tab>                      # case 2
    

    Expected Behavior: For scenario 1, all cases behave as expected for autocompletion. (suggestions is the list returned by the do_complete() function)

    1. suggestions = ['TIMESTAMPED', '_', 'browsers', 'proclist']
    2. suggestions = ['browsers']
    3. suggestions = ['']
    4. suggestions = ['APPLY', 'ATTR', 'DISP', 'FIND', 'GET', 'GROUP', 'INFO', 'JOIN', 'LIMIT', 'LOAD', 'NEW', 'OFFSET', 'SAVE', 'SORT', 'TIMESTAMPED', 'WHERE', '_', 'browsers', 'proclist']

    (My question: Why are browsers and proclist considered valid suggestions for case 4?)

    Scenario 2 can be generalized to all scenarios where a variable shares the starter characters for autocompletion as a command. Most cases behave as expected, EXCEPT Case 2 which returns suggestions = ['bc', 'pply', 'ttr']. The expected behavior would be suggestions = ['bc'].
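
To make the expected result for scenario 2 concrete, here is a small sketch of variable-only completion in a position where a variable name is expected (right after DISP). It returns the remaining suffix of each match, following the `['bc']` convention used above. The function `complete_variable` is hypothetical and is not the `do_complete()` implementation; it deliberately ignores keywords such as TIMESTAMPED.

```python
def complete_variable(last_word: str, variables: list) -> list:
    """Suggest suffixes of variables that start with `last_word`.

    Command keywords are never consulted, so a prefix shared with
    APPLY or ATTR cannot leak into the suggestions.
    """
    return sorted(
        name[len(last_word):] for name in variables if name.startswith(last_word)
    )


variables = ["_", "abc", "browsers", "proclist"]
print(complete_variable("a", variables))
```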


    The following details are in regards to how this issue relates to another open issue (https://github.com/opencybersecurityalliance/kestrel-lang/issues/79), which details expanding the autocompletion feature to support attributes.

    # attribute autocompletion
    DISP browsers ATTR <tab>         # case 1
    DISP browsers ATTR n<tab>        # case 2
    DISP browsers ATTR name<tab>     # case 3
    DISP browsers ATTR name <tab>    # case 4
    

    My implementation of the attribute autocompletion feature can be found here. For case 2, the parser treats 'n' as a completed attribute field, but also as the value of last_word when searching for suggestions for the next field. As such, we end up with suggestions = ['ew'], which is wrong (and confusingly weird). The other cases behave as expected, though it might just be coincidental for case 3 in particular (the same applies for variable autocompletion). It seems that this behavior is the same as variable partial completion, so this issue must be addressed before progress can be made on the other.

    bug 
    opened by vereimyst 0
  • attribute may not be variable

    Describe the bug

    procs = GET process
            FROM file:///tmp/lab101.json
            WHERE parent_ref.name = 'svchost.exe'
            START 2021-04-03T00:00:00Z STOP 2021-04-03T02:00:00Z
            
    procs_grps = GROUP procs BY binary_ref.name WITH COUNT(pid) AS number_of_procs
    
    APPLY python://attribute-plot ON procs_grps WITH XPARAM=binary_ref.name, YPARAM=number_of_procs
    

    error:

    [ERROR] KestrelSyntaxError: invalid token "'binary_ref.name'" at line 6 column 29, expects one of ['BIN', 'ATTRIBUTE']
    rewrite the failed statement.
    

    Kestrel version: v1.5.1

    bug documentation 
    opened by subbyte 1
  • Explore/Test Kestrel deployment on MS Windows

    Is your feature request related to a problem? Please describe. Currently Kestrel is supported on Linux and macOS. It could be useful to explore deployment on Microsoft Windows, writing doc on how to set it up (if special instruction is needed similar to the macOS requirement), and fixing issues in code if needed.

    Describe the solution you'd like A first step is to test/support Kestrel running in Windows Subsystem for Linux. The second step is to test/support Kestrel running as a native Windows application (with Python environment installed).

    documentation enhancement Hacktoberfest 
    opened by subbyte 0
Releases(v1.5.3)
  • v1.5.3(Nov 24, 2022)

    1.5.3 (2022-11-23)

    Added

    • Multiple test cases for escaped string parsed with main/ECGP parsers

    Fixed

    • Escaped string in value for both ECGP and argument
    • Token prefix not handled in

    Changed

    • Use firepit time function for timestamp parsing
    • Update Lark rule transform to vtrans to avoid Lark special function misfire

    Removed

    • Explicit dependency python-dateutil
  • v1.5.2(Oct 26, 2022)

    Added

    • Relative path support for environment variable starting with KESTREL #248
    • Relative path support for path in LOAD/SAVE
    • Relative path support for local uri, i.e., file://xxx or file://./xxx in GET
    • Unit test on relative path in environment variable
    • Unit test on relative path in LOAD
    • Unit test on relative path in data source in GET
  • v1.5.1(Oct 25, 2022)

    Added

    • Type checking in kestrel.semantics.reference
    • New exception MissingDataSource
    • Unit test on variable reference in GET
    • Unit test on last data source reuse

    Fixed

    • Missing data source if not specified #257
    • SymbolTable type error in code generation

    Removed

    • Obsoleted exception UnsupportedStixSyntax
  • v1.5.0(Oct 24, 2022)

    To make the WHERE clause friendlier than a strict STIX pattern, v1.5.0 introduces the Extended Centered Graph Pattern (ECGP), plus a complete Kestrel parser upgrade with multiple fixes (closing all issues in the Parser Upgrade milestone).

    • ECGP is STIX compatible, which means one can use STIX in WHERE clause as before.

    • The example of ECGP in WHERE (note that the host/endpoint is specified in a data source, e.g., an Elastic index, so that user- or system-generated queries do not retrieve unnecessary data):

    Figure: ECGP example in a WHERE clause.
    • Documentation on ECGP will come in v1.5.1

    • Full changelog:

    Added

    • Introduce ExtendedCenteredGraphPattern (ECGP) for WHERE clause

      • Support optional SCO/entity type for centered graph (STIX compatible)
      • Support optional square brackets (STIX compatible)
      • Support Single or double quotes (STIX compatible)
      • Support nested list as value (STIX compatible)
      • Support Kestrel variable as reference
      • Support escaped characters in quoted value
      • Support ECGP to string/STIX/firepit transformation
      • Support ECGP pruning (centered or extended components)
      • Support ECGP merge/extend with another ECGP
      • Parse into STIX (now ECGP) #14
      • Normalize WHERE clause between GET and expression
      • Add WHERE clause to command FIND
    • Upgrade arguments (in APPLY command)

      • Support quoted string in arguments #170
      • Dereference variables in arguments
    • Upgrade path (in GET/APPLY/LOAD/SAVE command)

      • Support escaped characters in quoted datasrc/analytics/path
    • Upgrade JSON parser for command NEW

    • Upgrade operators in syntax to be case insensitive

    • Upgrade timespan

      • absolute timespan without t and quotes
      • relative timespan for FIND
    • Upgrade prefetch with WHERE clause to eliminate unnecessary query

    • Multiple test cases for new syntax and features

    • Add macOS (arm64) install requirement to documentation

    Changed

    • Limit STIXPATH to ATTRIBUTE

      • command: SORT, GROUP, JOIN
      • expression clause: sort, attr
    • Use explicit list like (1,2,3) or [1,2,3] for multi-value argument

    • Formalize semantics processor in parser-semantics-codegen procedure

      • variable dereferencing in semantics processor
      • variable timerange extraction in semantics processor
  • v1.4.2(Sep 26, 2022)

    Added

    • links to Black Hat 2022 website, recording, and demo/lab
    • Kestrel logo in PNG
    • link to the Kestrel binder service blog post

    Fixed

    • consistent stix-shifter and connector versions

    Changed

    • lowercase grammar strings
  • v1.4.1(Jul 28, 2022)

    Added

    • multi-user cache folder support in debug mode #236
    • ppid used in process identification (post-prefetch) #238
    • process identification upgraded to a two-step approach
    • fine-grained process identification time offsets
    • per entity type prefetch config support #241
    • support for automatically converting input files to STIX in stixbundle interface

    Fixed

    • prefetch when parent_ref not in process table
    • false positives in generic relation resolution
    • second execution of a failed query should raise exception
    • master runtime directory test case fix
    • ~ support in config file path (env var)
  • v1.4.0(Jun 16, 2022)

    This release adds 2 new language features: relative timespans in place of exact timestamps in STIX patterns, and the ability to "bin" (aka "bucket") grouping attributes. "Binning" is a means of aggregating multiple entities into a single aggregate using a range of values (e.g. 5-minute intervals instead of grouping by exact timestamps).

    Fixed

    • Fix NameError: name 'DataSourceError' is not defined
    • Pass stix-shifter profile options into translation #230

    Added

    • Relative timespans instead of START/STOP #181
      • e.g. LAST 5 MINUTES
    • Group by "binned" (or "bucketed") attributes
      • e.g. GROUP foo BY BIN(first_observed, 5m)
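
The idea behind binning can be illustrated in a few lines of Python: each timestamp is rounded down to the start of its 5-minute interval, and entities are then counted per interval. The helper names are hypothetical, and aligning bins to the Unix epoch is an assumption of this sketch rather than a statement about how Kestrel's BIN() is implemented.

```python
from collections import Counter
from datetime import datetime, timezone


def bin_timestamp(ts: datetime, bin_seconds: int) -> datetime:
    """Round a timestamp down to the start of its bin (epoch-aligned)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bin_seconds, tz=timezone.utc)


def count_by_bin(timestamps, bin_seconds=300):
    """Count how many timestamps fall into each bin."""
    return Counter(bin_timestamp(ts, bin_seconds) for ts in timestamps)


first_observed = [
    datetime(2022, 6, 16, 12, 1, 0, tzinfo=timezone.utc),
    datetime(2022, 6, 16, 12, 4, 59, tzinfo=timezone.utc),
    datetime(2022, 6, 16, 12, 5, 0, tzinfo=timezone.utc),
]
for bin_start, count in sorted(count_by_bin(first_observed).items()):
    print(bin_start.isoformat(), count)
```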

    Changed

    • bump min Python version to 3.7
    • update OCA slack invitation link
  • v1.3.4(May 16, 2022)

    Kestrel binder service now supports dynamically adding data sources.

    Fixed

    • broken /tmp/kestrel symbolic link will crash a new session
    • double close (double release resources) with context manager and aexit
    • AttributeError with timestamped grouped variable #224
    • subsequent GET would return no results #228

    Added

    • documentation on macOS debug folder path
    • interface figure updated with new planned interfaces
    • dynamically load stix-shifter YAML profiles #227
    • new exception: MissingEntityAttribute
    • unit test: disp timestamped group by

    Changed

    • codecov GitHub App enabled instead of codecov-bot
    • stixshifter interface module connector split from interface.
  • v1.3.3(Apr 29, 2022)

  • v1.3.2(Apr 22, 2022)

    Summary

    Stabilize v1.3 with many bug fixes; improve auto-completion; add code coverage.

    Details

    See CHANGELOG.rst for complete info.

    Added

    • runtime warning generation for invalid entity type #200
    • auto-complete relation in FIND
    • auto-complete BY and variable in FIND
    • add logo to readthedocs
    • upgrade auto-complete keywords to be case sensitive #213
    • add testing coverage into github workflows
    • add codecov badge to README
    • 31 unit tests for auto-completion
    • the first unit test for JOIN
    • two unit tests for ASSIGN
    • five unit tests for EXPRESSION
    • use tmp dir for generated testing data
    • auto-deref with mixed ipv4/ipv6 in network-traffic

    Fixed

    • missing _refs handling for 2 cases out of 4 #205
    • incorrectly dereferencing attributes after GROUP BY
    • incorrectly yielding variable when auto-completing relation in FIND
    • pylint errors about undefined-variables

    Changed

    • update grammar to separate commands yielding (or not) a variable
    • change FUNCNAME from a terminal to an inlined rule
    • differentiate the terminal "by"i between FIND and SORT/GROUP
  • v1.3.1(Apr 17, 2022)

    Fix PyPI releasing issues, and update GitHub Action scripts to Python 3.10.

    Changed

    • GitHub Actions upgraded to setup-python@v3 + Python 3.10

    Fixed

    • The description failed to render when uploading to PyPI.
    • README.rst is missing images when rendered on non-GitHub sites, e.g., PyPI.
  • v1.3.0(Apr 15, 2022)

    Added

    • internal data model upgraded to firepit 2.0.0 with full graph-like database schema:

      • new firepit data normalized schema: https://firepit.readthedocs.io/en/latest/database.html
      • the normalized schema extracts/recognizes entities/SCOs from STIX observations and stores them and their relations.
      • the normalized schema fully enables a Kestrel variable to refer to a list of homogeneous entities as a view in a relational-DB table.
      • older hunts will need to be re-executed.
    • syntax upgrade: introducing the language construct expression to process a variable, e.g., adding a WHERE clause, and the processed variable can be

      • assigned to another variable, so one does not need another GET command with a STIX pattern to do filtering.
      • passed to DISP, so DISP is naturally upgraded to support many clauses such as SORT, LIMIT, etc.
    • new syntax for initial events handling besides entities:

      • entities in a variable do not have timestamps anymore; previously all observations of the entities were listed in a variable with timestamps.
      • use the function TIMESTAMPED() to wrap a variable into an expression when the user needs timestamps of the observations/events in which the entities appeared. This is useful for analyzing and visualizing events of entities through time, e.g., time series analysis of visited ipv4-addr entities in a variable.
    • unit tests:

      • 5 more unit tests for command FIND.
      • 2 more unit tests for command SAVE.
      • 2 unit tests for expression TIMESTAMPED().
    • new syntax added to language reference documentation

      • TIMESTAMPED
      • DISP
      • assign
    • repo updates:

      • Kestrel logo created.
      • GOVERNANCE.rst including versioning, release procedure, vulnerability disclosure, and more.

    Removed

    • the copy command is removed (replaced by the more generic assign command).

    Changed

    • repo front-page restructured to make it shorter but providing more information/links.
    • the overview page of Kestrel doc is turned into a directory of sections. The URL of the page is changed from overview.html to overview.
  • v1.2.3(Mar 23, 2022)

    Added

    • error message improvement: suggestion when a Python analytics is not found
    • performance improvement: cache STIX bundle for any downloaded bundle in the stix-bundle data source interface
    • performance improvement: pre-compile STIX pattern before matching in the stix-bundle data source interface
    • performance improvement: skip prefetch when the generated prefetch STIX pattern is the same as the user-specified pattern
    • documentation improvement: add building instructions for documentation
    • documentation improvement: add data source setup under Installation And Setup
    • documentation improvement: add analytics setup under Installation And Setup

    Fixed

    • STIX bundle downloaded without Last-Modified field in response header #187
    • case sensitive support for Python analytics profile name #189
  • v1.2.2(Mar 2, 2022)

    Added

    • remote data store support
    • unit test: Python analytics: APPLY after GET
    • unit test: Python analytics: APPLY on multiple variables

    Fixed

    • bump firepit version to fix transaction errors
    • bug fix: verify_package_origin() takes 1 argument

    Removed

    • unit test: Python 3.6 EOL and removed from GitHub Actions
  • v1.2.1(Feb 24, 2022)

  • v1.2.0(Feb 10, 2022)

    We are delighted to grow Kestrel with the Python analytics interface in this release.

    Important New Features

    1. Python analytics interface, which supports all existing Kestrel analytics in the kestrel-analytics repo.
    2. Automatic STIX-shifter connector install, which verifies and installs STIX-shifter connectors when needed.
    3. New documentation on Python analytics and Kestrel debug mode.

    Detailed Changelog

    • Added
      • Kestrel main package
        • matplotlib figure support in Kestrel Display Objects
        • analytics interface upgraded with config shared to Kestrel
      • Python analytics interface
        • minimal requirement design for writing a Python analytics
        • analytics function environment setup and destroy
        • support for a variety of display object outputs
        • parameters support
        • stack tracing for exception inside a Python analytics
      • STIX-shifter data source interface
        • automatic STIX-shifter connector install
          • connector name guess
          • connector origin verification
          • comprehensive error and suggestion if automatic install failed
        • pretty print for exception inside a Docker analytics
      • documentation
        • Python analytics interface
        • Kestrel debug page
        • flag to disable certificate verification in STIX-shifter profile example
    • Changed
      • abstract interface manager between datasource/analytics for code reuse
    • Fixed
      • auto-complete with data source #163
      • exception for empty STIX-shifter profile
      • STIX-shifter profile name should be case insensitive
      • exception inappropriately caught when dereferencing vars with no time range
    • Removed
      • documentation about STIX-shifter connector install
  • v1.1.7(Jan 27, 2022)

    This release focuses on upgrading Kestrel configuration management, solving #116 and #160 and paving the way for #138.

    Added

    • standalone Kestrel config module to support modular and simplified Kestrel config loading flow
    • shareable-state of config between Kestrel session and any Kestrel data source interfaces
    • stix-shifter interface upgraded with shareable-state of config support
    • stix-shifter DEBUG level env var KESTREL_STIXSHIFTER_DEBUG
    • stix-shifter config/profile loading from disk ~/.config/kestrel/stixshifter.yaml
    • debug message logging in kestrel_datasource_stixshifter
    • documentation for Kestrel main config with default config linked/shown

    Changed

    • default Kestrel config not managed by pip any more
    • main Kestrel config switched from TOML to YAML: ~/.config/kestrel/kestrel.yaml
    • upgrade Kestrel data source interfaces API with new config parameter
    • default stix-shifter debug level to INFO
    • documentation upgrade for kestrel_datasource_stixshifter

    Fixed

    • Kestrel config upgrade inconsistency #116
  • v1.1.6(Dec 15, 2021)

    Detect Log4Shell with Kestrel; see README for details.

    Added

    • advanced code auto-completion with parser support

    Fixed

    • dollar sign incorrectly displayed in Jupyter Notebook (dataframe to html)

    Changed

    • installation documentation upgrade
  • v1.1.4(Oct 27, 2021)

  • v1.1.3(Oct 9, 2021)

    We introduce comprehensive GROUP BY syntax, implementation, test, and documentation in this release, together with firepit upgrades.

    • GROUP BY multiple attributes
    • Aggregation function in GROUP BY
    • Support alias in GROUP BY
    • New test cases for GROUP BY
    • Documentation update for GROUP BY
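
As a rough picture of what grouping by multiple attributes with aliased aggregation functions means, the following Python sketch groups a list of records and computes named aggregates per group. It is a conceptual analogy, not Kestrel syntax or implementation: the `group_by` helper, the attribute names, and the sample records are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records standing in for entities held in a variable.
conns = [
    {"src": "10.0.0.1", "dst_port": 443, "bytes": 100},
    {"src": "10.0.0.1", "dst_port": 443, "bytes": 250},
    {"src": "10.0.0.1", "dst_port": 80, "bytes": 40},
    {"src": "10.0.0.2", "dst_port": 443, "bytes": 10},
]


def group_by(rows, keys, aggregations):
    """Group rows by several attributes and compute aliased aggregates.

    `aggregations` maps an alias to a (function, attribute) pair.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(keys, key))
        for alias, (func, attr) in aggregations.items():
            out[alias] = func(m[attr] for m in members)
        result.append(out)
    return result


summary = group_by(
    conns,
    keys=["src", "dst_port"],
    aggregations={"total_bytes": (sum, "bytes"), "max_bytes": (max, "bytes")},
)
for row in summary:
    print(row)
```

Each output row carries the grouping attributes plus one column per alias, which mirrors the three features listed above: multiple attributes, aggregation functions, and aliases.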
  • v1.1.2(Sep 13, 2021)

  • v1.1.1(Sep 3, 2021)

    Added

    • Minimal dependent package versions #67
    • Configuration option to disable execution summary display #86
    • Auto-removal of obsolete session caches #34
    • SQLite requirement in installation documentation

    Fixed

    • Python 3.6 support on command line utility #97

    Changed

    • Adjusting logging message levels to avoid confusion
  • v1.1.0(Aug 18, 2021)

    Composability Upgrade

    Now GROUP and SORT are like other commands and can be followed by any other commands such as GET and APPLY.

    Parser Upgrade

    Integers and floats are now supported as values in the JSON given to command NEW.

  • v1.0.14(Aug 18, 2021)

  • v1.0.13(Aug 14, 2021)

    Fixed

    • Single quotes support in STIX patterns to fix #95
    • Variable summary deduplication

    Added

    • Expected components in syntax error messages
  • v1.0.12(Aug 3, 2021)

  • v1.0.11(Aug 3, 2021)

  • v1.0.10(Jul 19, 2021)

    Fixed

    • Missing log in command line mode #84
    • Typo in documentation
    • Incorrect config file path

    Added

    • Select config file via environment variable #82
  • v1.0.9(Jul 7, 2021)

Owner
Open Cybersecurity Alliance
The Open Cybersecurity Alliance (OCA) fosters a cybersecurity ecosystem for exchanging information, orchestrated responses, etc. OCA is an OASIS Open Project.