AI-based, context-driven network device ranking

Batea

A batea is a large shallow pan of wood or iron traditionally used by gold prospectors for washing sand and gravel to recover gold nuggets.

Batea is a context-driven network device ranking framework based on the anomaly detection family of machine learning algorithms. The goal of Batea is to allow security teams to automatically filter interesting network assets in large networks using nmap scan reports. We call those Gold Nuggets.

For more information about Gold Nuggeting and the science behind Batea, check out our whitepaper.

You can try Batea on your nmap scan data without downloading the software, using Batea Live: https://batea.delvesecurity.com/

How it works

Batea works by constructing a numerical representation (numpy) of all devices from your nmap reports (XML) and then applying anomaly detection methods to uncover the gold nuggets. It is easily extendable by adding specific features, or interesting characteristics, to the numerical representation of the network elements.
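
As a rough illustration of that first step, the sketch below turns a tiny hand-written nmap-style XML document into one numeric row per host. It is not Batea's parser: the sample XML, the helper names, and the definition of a "low" port (below 1024) are assumptions made for the example, and plain lists stand in for the numpy matrix. For untrusted reports, a hardened parser such as defusedxml (pinned in Batea's requirements) is the safer choice over the standard library one used here.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for an nmap XML report (not real scan data).
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="10.0.0.5" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/></port>
      <port protocol="tcp" portid="8080"><state state="open"/></port>
      <port protocol="tcp" portid="3306"><state state="closed"/></port>
    </ports>
  </host>
  <host>
    <address addr="10.0.0.9" addrtype="ipv4"/>
  </host>
</nmaprun>"""


def is_open(port):
    """True when the <port> element has a <state state="open"/> child."""
    state = port.find("state")
    return state is not None and state.get("state") == "open"


def host_to_row(host):
    """Map one <host> element to [port_count, open_port_count, low_port_count]."""
    ports = host.findall("ports/port")  # empty list when <ports> is absent
    open_count = sum(1 for p in ports if is_open(p))
    # Assumption: a "low" port is any port number below 1024.
    low_count = sum(1 for p in ports if int(p.get("portid")) < 1024)
    return [float(len(ports)), float(open_count), float(low_count)]


def report_to_matrix(xml_text):
    """Return (addresses, matrix) with one numeric row per host, in document order."""
    hosts = ET.fromstring(xml_text).findall("host")
    addresses = [h.find("address").get("addr") for h in hosts]
    return addresses, [host_to_row(h) for h in hosts]


addresses, matrix = report_to_matrix(SAMPLE_XML)
for addr, row in zip(addresses, matrix):
    print(addr, row)
```

The second host has no `<ports>` element at all, which happens in real reports too; it still gets a row of zeros, so every host ends up with a value for every feature.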

The numerical representation of the network is constructed using features, which are inspired by the expertise of the security community. The features act as elements of intuition, and the unsupervised anomaly detection methods allow the context of the network asset, or the total description of the network, to be used as the central building block of the ranking algorithm. The algorithm used is Isolation Forest (https://en.wikipedia.org/wiki/Isolation_forest).
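
To give a feel for how Isolation Forest ranks hosts, here is a small pure-Python sketch of the idea: random trees split the feature matrix, and rows that get isolated after few splits receive a high score. It is a teaching aid under simplifying assumptions (no subsampling, made-up feature rows, hypothetical function names), not the code Batea runs; the tracebacks in the issues below show Batea importing scikit-learn's `IsolationForest`.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    col = rng.randrange(len(rows[0]))
    low = min(r[col] for r in rows)
    high = max(r[col] for r in rows)
    if low == high:  # this feature cannot separate the remaining rows
        return {"size": len(rows)}
    split = rng.uniform(low, high)
    left = [r for r in rows if r[col] < split]
    right = [r for r in rows if r[col] >= split]
    return {
        "col": col,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row lands, plus a correction for leaves holding several rows."""
    if "size" in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if row[node["col"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(rows, n_trees=100, seed=0):
    """Return one score per row; higher means the row is easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two hosts to rank")
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(rows)))
    trees = [build_tree(rows, 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(len(rows))
    scores = []
    for row in rows:
        mean_depth = sum(path_length(row, tree) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Made-up rows of [port_count, open_port_count]; the last host stands out.
rows = [[3, 3], [2, 2], [3, 2], [4, 3], [2, 1], [3, 3], [4, 4], [40, 38]]
scores = anomaly_scores(rows)
ranking = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
print("most anomalous host index:", ranking[0])
```

The sketch refuses to score a single host because the normalisation term is zero in that case, so no meaningful ranking exists.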

Machine learning models are the heart of Batea. Models are algorithms trained on the whole dataset and used to predict a score on the same (and other) data points (network devices). Batea also allows for model persistence. That is, you can re-use pretrained models and export models trained on large datasets for further use.
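
The persistence idea can be sketched with the standard library alone. The example below trains a deliberately trivial "model" (per-feature means), dumps it to a file, reloads it, and scores a new scan. The toy model, the function names, and the use of `pickle` are assumptions made for illustration; the source does not describe the on-disk format Batea uses for its `-D` and `-L` options.

```python
import os
import pickle
import tempfile


def fit(rows):
    """Train a toy model: remember the per-feature mean of the training rows."""
    n = len(rows)
    return {"mean": [sum(col) / n for col in zip(*rows)]}


def score(model, rows):
    """Score each row by its absolute distance from the training mean."""
    return [sum(abs(v - m) for v, m in zip(row, model["mean"])) for row in rows]


def dump_model(model, path):
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    # Only unpickle files you trust: loading a pickle can execute arbitrary code.
    with open(path, "rb") as fh:
        return pickle.load(fh)


train = [[3.0, 3.0], [2.0, 2.0], [4.0, 4.0]]  # stands in for a large reference scan
new_scan = [[3.0, 3.0], [40.0, 38.0]]          # scored later with the saved model

model = fit(train)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mymodel.batea")
    dump_model(model, path)
    restored = load_model(path)

scores = score(restored, new_scan)
print(scores)
```

Because the reloaded model carries the statistics of the original training data, new hosts are judged against that earlier context rather than against each other.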

Usage

# Complete info
$ sudo nmap -A 192.168.0.0/16 -oX output.xml

# Partial info
$ sudo nmap -O -sV 192.168.0.0/16 -oX output.xml


$ batea -v output.xml

Installation

$ git clone git@github.com:delvelabs/batea.git
$ cd batea
$ python3 setup.py sdist
$ pip3 install -r requirements.txt
$ pip3 install -e .

Developers Installation

$ git clone git@github.com:delvelabs/batea.git
$ cd batea
$ python3 -m venv batea/
$ source batea/bin/activate
$ python3 setup.py sdist
$ pip3 install -r requirements-dev.txt
$ pip3 install -e .
$ pytest

Example usage

# simple use (output top 5 gold nuggets with default format)
$ batea nmap_report.xml

# Output top 3
$ batea -n 3 nmap_report.xml

# Output all assets
$ batea -A nmap_report.xml

# Using multiple input files
$ batea -A nmap_report1.xml nmap_report2.xml

# Using wildcards (input format defaults to xml)
$ batea ./nmap*.xml
$ batea -f csv ./assets*.csv

# You can use batea on pretrained models and export trained models.

# Training, output and dumping model for persistence
$ batea -D mymodel.batea nmap_report.xml

# Using pretrained model
$ batea -L mymodel.batea nmap_report.xml

# Using preformatted csv along with xml files
$ batea -x nmap_report.xml -c portscan_data.csv

# Adjust verbosity
$ batea -vv nmap_report.xml

How to add a feature

Batea works by assigning numerical features to every host in the report (or series of reports). Hosts are Python objects derived from the nmap report. They consist of the following list of attributes: [ipv4, hostname, os_info, ports], where ports is a list of port objects. Each port has the following list of attributes: [port, protocol, state, service, software, version, cpe, scripts], all defaulting to None.
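
The shape of these objects can be pictured with two dataclasses. This is only a sketch of the attribute lists above: the class definitions and the type hints are assumptions, and Batea's own Host and Port classes may be built differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Port:
    # Every attribute defaults to None, as described in the text.
    port: Optional[int] = None
    protocol: Optional[str] = None
    state: Optional[str] = None
    service: Optional[str] = None
    software: Optional[str] = None
    version: Optional[str] = None
    cpe: Optional[str] = None
    scripts: Optional[list] = None


@dataclass
class Host:
    ipv4: Optional[str] = None
    hostname: Optional[str] = None
    os_info: Optional[dict] = None
    ports: List[Port] = field(default_factory=list)  # a fresh list per host


host = Host(
    ipv4="10.251.53.100",
    hostname="internal.delvesecurity.com",
    ports=[Port(port=111, protocol="tcp", state="open", service="rpcbind")],
)
print(host.ports[0].service, host.ports[0].version)
```

Attributes that the scan did not report stay at None, which is why a feature should always fall back to a numeric default.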

Features are objects inherited from the FeatureBase class that implement a specific _transform method. This method always takes the list of all hosts as input and returns a function (typically a lambda) that maps each host to a numeric value. Applied to every host in order, it yields one numpy column of numeric values (host order is preserved). The column is then appended to the matrix representation of the report. Features must output valid numerical values (floats or integers) and nothing else.

Most feature transformations are implemented using a simple lambda function. Just make sure every host gets a default numeric value, for model compatibility.

Example:

class CustomInterestingPorts(FeatureBase):
    def __init__(self):
        super().__init__(name="some_custom_interesting_ports")

    def _transform(self, hosts):
        """Take the list of all hosts and return a function that counts how many of a
        host's ports belong to a predefined list of "interesting" ports, defaulting to 0.

        Parameters
        ----------
        hosts : list
            The list of all hosts

        Returns
        -------
        f : lambda function
            Counts the number of ports in the defined list.
        """
        member_ports = [21, 22, 25, 8080, 8081, 1234]
        f = lambda host: len([port for port in host.ports if port.port in member_ports])
        return f

You can then add the feature to the report by using the NmapReport.add_feature method in batea/__init__.py:

from .features.basic_features import CustomInterestingPorts

def build_report():
    report = NmapReport()
    #[...]
    report.add_feature(CustomInterestingPorts())

    return report
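
To see why _transform receives the whole host list, the self-contained sketch below builds a small matrix from two features, one of which depends on network-wide totals. The FeatureBase stand-in, its transform method, the build_matrix helper, and both feature classes are hypothetical and only mimic the mechanism described above; they are not Batea's classes.

```python
from types import SimpleNamespace


class FeatureBase:
    """Hypothetical stand-in for Batea's FeatureBase, for illustration only."""

    def __init__(self, name):
        self.name = name

    def transform(self, hosts):
        f = self._transform(hosts)  # per-host function built with the full context
        return [float(f(host)) for host in hosts]  # one value per host, order kept


class PortCount(FeatureBase):
    def __init__(self):
        super().__init__(name="port_count")

    def _transform(self, hosts):
        return lambda host: len(host.ports)


class PortShare(FeatureBase):
    """Fraction of all ports seen on the network that belong to this host."""

    def __init__(self):
        super().__init__(name="port_share")

    def _transform(self, hosts):
        total = sum(len(h.ports) for h in hosts)
        # Default to 0.0 so every host always gets a numeric value.
        return lambda host: len(host.ports) / total if total else 0.0


def build_matrix(hosts, features):
    """Stack one column per feature into rows, one row per host."""
    columns = [feature.transform(hosts) for feature in features]
    return [list(row) for row in zip(*columns)]


hosts = [
    SimpleNamespace(ports=[22, 80, 443]),
    SimpleNamespace(ports=[22]),
    SimpleNamespace(ports=[]),
]
matrix = build_matrix(hosts, [PortCount(), PortShare()])
for row in matrix:
    print(row)
```

PortShare cannot be computed from a single host in isolation, which is the kind of contextual feature the all-hosts signature makes possible.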

Using precomputed tabular data (CSV)

It is possible to use preprocessed data to train the model or for prediction. The data has to be indexed by (ipv4, port), with one unique combination per row. The data should be close to what you would get from the XML version of an nmap report. Every column must use one of the following names, but you don't have to include all of them. The parser defaults to null values if a column is absent.

  'ipv4',
  'hostname',
  'os_name',
  'port',
  'state',
  'protocol',
  'service',
  'software_banner',
  'version',
  'cpe',
  'other_info'

Example:

ipv4,hostname,os_name,port,state,protocol,service,software_banner
10.251.53.100,internal.delvesecurity.com,Linux,110,open,tcp,rpcbind,"program version   port/proto  service100000  2,3,4        111/tcp  rpcbind100000  2,3,4    "
10.251.53.100,internal.delvesecurity.com,Linux,111,open,tcp,rpcbind,
10.251.53.188,serious.delvesecurity.com,Linux,6000,open,tcp,X11,"X11Probe: CentOS"
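
A quick way to sanity-check such a file before feeding it to Batea is sketched below. It reads the CSV with the standard library, fills absent columns with None, rejects duplicate (ipv4, port) pairs, and groups rows by host. The load_rows helper is hypothetical, the last sample row is invented for illustration, and treating empty cells as null is an assumption about how the real parser behaves.

```python
import csv
import io

COLUMNS = ["ipv4", "hostname", "os_name", "port", "state", "protocol",
           "service", "software_banner", "version", "cpe", "other_info"]

# The first two data rows come from the example above; the third is invented.
SAMPLE = """ipv4,hostname,os_name,port,state,protocol,service,software_banner
10.251.53.100,internal.delvesecurity.com,Linux,111,open,tcp,rpcbind,
10.251.53.188,serious.delvesecurity.com,Linux,6000,open,tcp,X11,"X11Probe: CentOS"
10.251.53.188,serious.delvesecurity.com,Linux,22,open,tcp,ssh,
"""


def load_rows(text):
    """Group CSV rows by ipv4; absent or empty columns become None."""
    hosts = {}
    seen = set()
    for raw in csv.DictReader(io.StringIO(text)):
        row = {col: (raw.get(col) or None) for col in COLUMNS}
        key = (row["ipv4"], row["port"])
        if key in seen:
            raise ValueError(f"duplicate (ipv4, port) pair: {key}")
        seen.add(key)
        hosts.setdefault(row["ipv4"], []).append(row)
    return hosts


hosts = load_rows(SAMPLE)
for ipv4, rows in hosts.items():
    print(ipv4, [r["port"] for r in rows])
```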

Outputting numerical representation

For the data scientist in you, or just for fun and profit, you can output the numerical matrix along with the score column instead of the regular output. This can be useful for further data analysis and for debugging.

$ batea -oM network_matrix nmap_report.xml
Comments
  • Fails to parse nmap output files

    ❯ batea -h
    Usage: batea [OPTIONS] [NMAP_REPORTS]...
    
      Context-driven asset ranking based using anomaly detection
    
    Options:
      -c, --read-csv FILENAME
      -x, --read-xml FILENAME
      -n, --n-output INTEGER
      -A, --output-all
      -L, --load-model FILENAME
      -D, --dump-model FILENAME
      -f, --input-format TEXT
      -v, --verbose
      -oM, --output-matrix FILENAME
      -h, --help                     Show this message and exit.
    ❯ batea -v -x output_3.xml
    /home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/sklearn/ensemble/iforest.py:478: RuntimeWarning: invalid value encountered in true_divide
      -depths
    /home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/sklearn/ensemble/iforest.py:478: RuntimeWarning: invalid value encountered in true_divide
      -depths
    {
        "report_info": [
            {
                "number_of_hosts": 1,
                "features": [
                    "ip_octet_0",
                    "ip_octet_1",
                    "ip_octet_2",
                    "ip_octet_3",
                    "port_count",
                    "open_port_count",
                    "low_port_count",
                    "tcp_port_count",
                    "named_service_count",
                    "software_banner_count",
                    "max_banner_length",
                    "is_windows",
                    "is_linux",
                    "http_server_count",
                    "database_count",
                    "windows_domain_admin_count",
                    "windows_domain_member_count",
                    "port_entropy",
                    "hostname_length",
                    "hostname_entropy"
                ]
            }
        ],
        "host_info": [
            {
                "rank": "1",
                "host": "x.x.x.113",
                "score": NaN,
                "hostname": null,
                "os": {
                    "vendor": "Linux",
                    "family": "Linux",
                    "type": "general purpose",
                    "name": "Linux 3.10 - 4.8",
                    "accuracy": 96
                },
                "features": {
                    "ip_octet_0": x,
                    "ip_octet_1": x,
                    "ip_octet_2": x,
                    "ip_octet_3": 113.0,
                    "port_count": 3.0,
                    "open_port_count": 3.0,
                    "low_port_count": 3.0,
                    "tcp_port_count": 3.0,
                    "named_service_count": 3.0,
                    "software_banner_count": 3.0,
                    "max_banner_length": 7.0,
                    "is_windows": 0.0,
                    "is_linux": 1.0,
                    "http_server_count": 2.0,
                    "database_count": 0.0,
                    "windows_domain_admin_count": 0.0,
                    "windows_domain_member_count": 0.0,
                    "port_entropy": 1.584962500721156,
                    "hostname_length": 0.0,
                    "hostname_entropy": 0.0
                }
            }
        ]
    }
    ❯ batea -v -x output.xml
    Traceback (most recent call last):
      File "/home/drdinosaur/.pyenv/versions/3.8.0/bin/batea", line 11, in <module>
        load_entry_point('batea', 'console_scripts', 'batea')()
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 722, in __call__
        return self.main(*args, **kwargs)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 697, in main
        rv = self.invoke(ctx)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 895, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 535, in invoke
        return callback(*args, **kwargs)
      File "/home/drdinosaur/batea/batea/__main__.py", line 60, in main
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/home/drdinosaur/batea/batea/__main__.py", line 60, in <listcomp>
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 29, in load_hosts
        host = self._generate_host(child)
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 37, in _generate_host
        ports=self._find_ports(subtree))
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 52, in _find_ports
        for port in host.find("ports").findall("port"):
    AttributeError: 'NoneType' object has no attribute 'findall'
    ❯ batea -v -x output_2.xml
    Traceback (most recent call last):
      File "/home/drdinosaur/.pyenv/versions/3.8.0/bin/batea", line 11, in <module>
        load_entry_point('batea', 'console_scripts', 'batea')()
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 722, in __call__
        return self.main(*args, **kwargs)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 697, in main
        rv = self.invoke(ctx)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 895, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/drdinosaur/.pyenv/versions/3.8.0/lib/python3.8/site-packages/click/core.py", line 535, in invoke
        return callback(*args, **kwargs)
      File "/home/drdinosaur/batea/batea/__main__.py", line 60, in main
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/home/drdinosaur/batea/batea/__main__.py", line 60, in <listcomp>
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 29, in load_hosts
        host = self._generate_host(child)
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 37, in _generate_host
        ports=self._find_ports(subtree))
      File "/home/drdinosaur/batea/batea/core/nmap_parser.py", line 52, in _find_ports
        for port in host.find("ports").findall("port"):
    AttributeError: 'NoneType' object has no attribute 'findall'
    

    The tool is able to handle an output file with one host, but errors out on files with 9 or 10 hosts.

    opened by DrDinosaur 5
  • Issues installing batea

    When I tried to install batea I got the error below, and I am unsure why. I am running it on Darwin HQSML-1689616 19.6.0 Darwin Kernel Version 19.6.0: Thu Jun 18 20:49:00 PDT 2020; root:xnu-6153.141.1~1/RELEASE_X86_64 x86_64

    python3 setup.py sdist; pip3 install -r requirements.txt; pip3 install -e .;
    running sdist
    running egg_info
    error: [Errno 13] Permission denied
    Requirement already satisfied: defusedxml==0.6.0 in /usr/local/lib/python3.8/site-packages (from -r requirements.txt (line 1)) (0.6.0)
    Requirement already satisfied: numpy==1.17.2 in /usr/local/lib/python3.8/site-packages (from -r requirements.txt (line 2)) (1.17.2)
    Requirement already satisfied: click==6.7 in /usr/local/lib/python3.8/site-packages (from -r requirements.txt (line 3)) (6.7)
    Collecting scikit-learn==0.21.3
      Using cached scikit-learn-0.21.3.tar.gz (12.2 MB)
    Processing /Users/gbiago909/Library/Caches/pip/wheels/5f/a4/f5/58f4823fa59f34b12e2ad34a5f05d443346f42aa486a4df1aa/pandas-0.25.0-cp38-cp38-macosx_10_15_x86_64.whl
    Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.8/site-packages (from scikit-learn==0.21.3->-r requirements.txt (line 4)) (1.5.2)
    Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.8/site-packages (from scikit-learn==0.21.3->-r requirements.txt (line 4)) (0.16.0)
    Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.8/site-packages (from pandas==0.25.0->-r requirements.txt (line 5)) (2020.1)
    Requirement already satisfied: python-dateutil>=2.6.1 in /usr/local/lib/python3.8/site-packages (from pandas==0.25.0->-r requirements.txt (line 5)) (2.8.1)
    Requirement already satisfied: six>=1.5 in /usr/local/Cellar/protobuf/3.12.4/libexec/lib/python3.8/site-packages (from python-dateutil>=2.6.1->pandas==0.25.0->-r requirements.txt (line 5)) (1.15.0)
    Building wheels for collected packages: scikit-learn
      Building wheel for scikit-learn (setup.py) ... error
      ERROR: Command errored out with exit status 1:
       command: /usr/local/opt/python@3.8/bin/python3.8 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"'; __file__='"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-wheel-1_qd4w7j
           cwd: /private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/
      Complete output (28 lines):
      Partial import of sklearn during the build process.
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 290, in <module>
          setup_package()
        File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 286, in setup_package
          setup(**metadata)
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/core.py", line 137, in setup
          config = configuration()
        File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 174, in configuration
          config.add_subpackage('sklearn')
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 1033, in add_subpackage
          config_list = self.get_subpackage(subpackage_name, subpackage_path,
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 999, in get_subpackage
          config = self._get_configuration_from_setup_py(
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 941, in _get_configuration_from_setup_py
          config = setup_module.configuration(*args)
        File "sklearn/setup.py", line 62, in configuration
          config.add_subpackage('utils')
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 1033, in add_subpackage
          config_list = self.get_subpackage(subpackage_name, subpackage_path,
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 999, in get_subpackage
          config = self._get_configuration_from_setup_py(
        File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 941, in _get_configuration_from_setup_py
          config = setup_module.configuration(*args)
        File "sklearn/utils/setup.py", line 8, in configuration
          from Cython import Tempita
      ModuleNotFoundError: No module named 'Cython'
      ----------------------------------------
      ERROR: Failed building wheel for scikit-learn
      Running setup.py clean for scikit-learn
    Failed to build scikit-learn
    Installing collected packages: scikit-learn, pandas
        Running setup.py install for scikit-learn ... error
        ERROR: Command errored out with exit status 1:
         command: /usr/local/opt/python@3.8/bin/python3.8 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"'; __file__='"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-record-t3su3ll1/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/scikit-learn
             cwd: /private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/
        Complete output (28 lines):
        Partial import of sklearn during the build process.
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 290, in <module>
            setup_package()
          File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 286, in setup_package
            setup(**metadata)
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/core.py", line 137, in setup
            config = configuration()
          File "/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py", line 174, in configuration
            config.add_subpackage('sklearn')
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 1033, in add_subpackage
            config_list = self.get_subpackage(subpackage_name, subpackage_path,
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 999, in get_subpackage
            config = self._get_configuration_from_setup_py(
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 941, in _get_configuration_from_setup_py
            config = setup_module.configuration(*args)
          File "sklearn/setup.py", line 62, in configuration
            config.add_subpackage('utils')
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 1033, in add_subpackage
            config_list = self.get_subpackage(subpackage_name, subpackage_path,
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 999, in get_subpackage
            config = self._get_configuration_from_setup_py(
          File "/usr/local/lib/python3.8/site-packages/numpy/distutils/misc_util.py", line 941, in _get_configuration_from_setup_py
            config = setup_module.configuration(*args)
          File "sklearn/utils/setup.py", line 8, in configuration
            from Cython import Tempita
        ModuleNotFoundError: No module named 'Cython'
        ----------------------------------------
    ERROR: Command errored out with exit status 1: /usr/local/opt/python@3.8/bin/python3.8 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"'; __file__='"'"'/private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-install-vxaj13m8/scikit-learn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /private/var/folders/j5/t3kln5r959vc5bfr4hp20vmh0000gn/T/pip-record-t3su3ll1/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/scikit-learn Check the logs for full command output.
    Obtaining file:///opt/batea
    Requirement already satisfied: defusedxml==0.6.0 in /usr/local/lib/python3.8/site-packages (from batea==0.0.1) (0.6.0)
    Installing collected packages: batea
      Attempting uninstall: batea
        Found existing installation: batea 0.0.1
        Uninstalling batea-0.0.1:
    ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.8/site-packages/easy-install.pth'
    Consider using the `--user` option or check the permissions.
    

    Output from when I try to run it:

    Traceback (most recent call last):
      File "/usr/local/bin/batea", line 33, in <module>
        sys.exit(load_entry_point('batea==0.0.1', 'console_scripts', 'batea')())
      File "/usr/local/bin/batea", line 25, in importlib_load_entry_point
        return next(matches).load()
      File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/importlib/metadata.py", line 77, in load
        module = import_module(match.group('module'))
      File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
      File "<frozen importlib._bootstrap>", line 991, in _find_and_load
      File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
      File "<frozen importlib._bootstrap>", line 991, in _find_and_load
      File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
      File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
      File "<frozen zipimport>", line 259, in load_module
      File "/usr/local/lib/python3.8/site-packages/batea-0.0.1-py3.8.egg/batea/__init__.py", line 18, in <module>
    ModuleNotFoundError: No module named 'batea.core'
    

    I am having the same errors on my Ubuntu server, which I retuned to be like Kali: Linux kali 4.15.0-118-generic #119-Ubuntu SMP Tue Sep 8 12:30:01 UTC 2020 x86_64 GNU/Linux

    opened by gbiagomba 4
  • ValueError: numpy.ndarray size changed

    After installing Batea in a venv, I get the following error when I try to execute it:

    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
    
    opened by Anth0rx 3
  • AttributeError: 'NoneType' object has no attribute 'findall'

    Traceback (most recent call last):
      File "/usr/local/bin/batea", line 11, in <module>
        load_entry_point('batea', 'console_scripts', 'batea')()
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 722, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 697, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 895, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 535, in invoke
        return callback(*args, **kwargs)
      File "/Users/user/Documents/Projects/batea/batea/__main__.py", line 51, in main
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/Users/user/Documents/Projects/batea/batea/__main__.py", line 51, in <listcomp>
        report.hosts.extend([host for host in xml_parser.load_hosts(file)])
      File "/Users/user/Documents/Projects/batea/batea/core/nmap_parser.py", line 29, in load_hosts
        host = self._generate_host(child)
      File "/Users/user/Documents/Projects/batea/batea/core/nmap_parser.py", line 37, in _generate_host
        ports=self._find_ports(subtree))
      File "/Users/user/Documents/Projects/batea/batea/core/nmap_parser.py", line 52, in _find_ports
        for port in host.find("ports").findall("port"):
    AttributeError: 'NoneType' object has no attribute 'findall'
    
    opened by ghost 2
  • Error running batea

    Hi, I have just tried to install and run the project, with or without sudo, and I am getting errors from numpy. I am running Kali.

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ batea -v output.xml
    RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
    Traceback (most recent call last):
      File "/home/kali/.local/bin/batea", line 33, in <module>
        sys.exit(load_entry_point('batea', 'console_scripts', 'batea')())
      File "/home/kali/.local/bin/batea", line 25, in importlib_load_entry_point
        return next(matches).load()
      File "/usr/lib/python3.9/importlib/metadata.py", line 86, in load
        module = import_module(match.group('module'))
      File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 850, in exec_module
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "/home/kali/git/batea/batea/__init__.py", line 18, in <module>
        from .core.nmap_parser import NmapReportParser
      File "/home/kali/git/batea/batea/core/__init__.py", line 22, in <module>
        from .model import BateaModel
      File "/home/kali/git/batea/batea/core/model.py", line 18, in <module>
        from sklearn.ensemble import IsolationForest
      File "/home/kali/.local/lib/python3.9/site-packages/sklearn/__init__.py", line 80, in <module>
        from .base import clone
      File "/home/kali/.local/lib/python3.9/site-packages/sklearn/base.py", line 21, in <module>
        from .utils import _IS_32BIT
      File "/home/kali/.local/lib/python3.9/site-packages/sklearn/utils/__init__.py", line 20, in <module>
        from scipy.sparse import issparse
      File "/usr/lib/python3/dist-packages/scipy/sparse/__init__.py", line 228, in <module>
        from .csr import *
      File "/usr/lib/python3/dist-packages/scipy/sparse/csr.py", line 10, in <module>
        from ._sparsetools import (csr_tocsc, csr_tobsr, csr_count_blocks,
    ImportError: numpy.core.multiarray failed to import

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ cd ..

    ┌──(kali㉿kali)-[~/git]
    └─$ rm -rf batea

    ┌──(kali㉿kali)-[~/git]
    └─$ git clone https://github.com/delvelabs/batea.git
    Cloning into 'batea'...
    remote: Enumerating objects: 258, done.
    remote: Counting objects: 100% (68/68), done.
    remote: Compressing objects: 100% (23/23), done.
    remote: Total 258 (delta 59), reused 45 (delta 45), pack-reused 190
    Receiving objects: 100% (258/258), 73.92 KiB | 764.00 KiB/s, done.
    Resolving deltas: 100% (144/144), done.

    ┌──(kali㉿kali)-[~/git]
    └─$ cd batea

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ sudo python3 setup.py sdist
    [sudo] password for kali:
    running sdist
    running egg_info
    creating batea.egg-info
    writing batea.egg-info/PKG-INFO
    writing dependency_links to batea.egg-info/dependency_links.txt
    writing entry points to batea.egg-info/entry_points.txt
    writing requirements to batea.egg-info/requires.txt
    writing top-level names to batea.egg-info/top_level.txt
    writing manifest file 'batea.egg-info/SOURCES.txt'
    reading manifest file 'batea.egg-info/SOURCES.txt'
    adding license file 'LICENSE.md'
    writing manifest file 'batea.egg-info/SOURCES.txt'
    running check
    creating batea-0.0.1
    creating batea-0.0.1/batea
    creating batea-0.0.1/batea.egg-info
    copying files to batea-0.0.1...
    copying LICENSE.md -> batea-0.0.1
    copying README.md -> batea-0.0.1
    copying setup.cfg -> batea-0.0.1
    copying setup.py -> batea-0.0.1
    copying batea/__init__.py -> batea-0.0.1/batea
    copying batea/__main__.py -> batea-0.0.1/batea
    copying batea/version.py -> batea-0.0.1/batea
    copying batea.egg-info/PKG-INFO -> batea-0.0.1/batea.egg-info
    copying batea.egg-info/SOURCES.txt -> batea-0.0.1/batea.egg-info
    copying batea.egg-info/dependency_links.txt -> batea-0.0.1/batea.egg-info
    copying batea.egg-info/entry_points.txt -> batea-0.0.1/batea.egg-info
    copying batea.egg-info/requires.txt -> batea-0.0.1/batea.egg-info
    copying batea.egg-info/top_level.txt -> batea-0.0.1/batea.egg-info
    Writing batea-0.0.1/setup.cfg
    creating dist
    Creating tar archive
    removing 'batea-0.0.1' (and everything under it)

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ sudo pip3 install -r requirements.txt
    Collecting defusedxml==0.6.0
      Downloading defusedxml-0.6.0-py2.py3-none-any.whl (23 kB)
    Collecting numpy==1.19.4
      Downloading numpy-1.19.4-cp39-cp39-manylinux2010_x86_64.whl (14.5 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.5/14.5 MB 10.2 MB/s eta 0:00:00
    Collecting click==7.1.2
      Downloading click-7.1.2-py2.py3-none-any.whl (82 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 82.8/82.8 KB 15.0 MB/s eta 0:00:00
    Collecting scikit-learn==0.23.2
      Downloading scikit-learn-0.23.2.tar.gz (7.2 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.2/7.2 MB 9.4 MB/s eta 0:00:00
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
      Preparing metadata (pyproject.toml) ... done
    Collecting pandas==1.1.4
      Downloading pandas-1.1.4-cp39-cp39-manylinux1_x86_64.whl (9.3 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.3/9.3 MB 10.1 MB/s eta 0:00:00
    Collecting joblib>=0.11
      Downloading joblib-1.1.0-py2.py3-none-any.whl (306 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.0/307.0 KB 7.9 MB/s eta 0:00:00
    Collecting threadpoolctl>=2.0.0
      Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
    Requirement already satisfied: scipy>=0.19.1 in /usr/lib/python3/dist-packages (from scikit-learn==0.23.2->-r requirements.txt (line 4)) (1.7.3)
    Requirement already satisfied: python-dateutil>=2.7.3 in /usr/lib/python3/dist-packages (from pandas==1.1.4->-r requirements.txt (line 5)) (2.8.1)
    Requirement already satisfied: pytz>=2017.2 in /usr/lib/python3/dist-packages (from pandas==1.1.4->-r requirements.txt (line 5)) (2022.1)
    Building wheels for collected packages: scikit-learn
      Building wheel for scikit-learn (pyproject.toml) ... done
      Created wheel for scikit-learn: filename=scikit_learn-0.23.2-cp39-cp39-linux_x86_64.whl size=21712114 sha256=931033a0fc5847f2cf85d6f29d0aa88d849cf99df627e5b4a514a7316ececd6b
      Stored in directory: /root/.cache/pip/wheels/5e/74/24/7e235ccf01765c0daa089c98cc823e9dc1383da5fe0ed7e224
    Successfully built scikit-learn
    Installing collected packages: threadpoolctl, numpy, joblib, defusedxml, click, scikit-learn, pandas
      Attempting uninstall: numpy
        Found existing installation: numpy 1.21.5
        Not uninstalling numpy at /usr/lib/python3/dist-packages, outside environment /usr
        Can't uninstall 'numpy'. No files were found to uninstall.
      Attempting uninstall: defusedxml
        Found existing installation: defusedxml 0.7.1
        Not uninstalling defusedxml at /usr/lib/python3/dist-packages, outside environment /usr
        Can't uninstall 'defusedxml'. No files were found to uninstall.
      Attempting uninstall: click
        Found existing installation: click 8.0.3
        Not uninstalling click at /usr/lib/python3/dist-packages, outside environment /usr
        Can't uninstall 'click'. No files were found to uninstall.
      Attempting uninstall: pandas
        Found existing installation: pandas 1.3.5
        Not uninstalling pandas at /usr/lib/python3/dist-packages, outside environment /usr
        Can't uninstall 'pandas'. No files were found to uninstall.
    Successfully installed click-7.1.2 defusedxml-0.6.0 joblib-1.1.0 numpy-1.19.4 pandas-1.1.4 scikit-learn-0.23.2 threadpoolctl-3.1.0
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ sudo pip3 install -e .
    Obtaining file:///home/kali/git/batea
      Preparing metadata (setup.py) ... done
    Requirement already satisfied: defusedxml==0.6.0 in /usr/local/lib/python3.9/dist-packages (from batea==0.0.1) (0.6.0)
    Installing collected packages: batea
      Running setup.py develop for batea
    Successfully installed batea-0.0.1
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ cp ../../output.xml .

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ sudo batea -v output.xml
    RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
    Traceback (most recent call last):
      File "/usr/local/bin/batea", line 33, in <module>
        sys.exit(load_entry_point('batea', 'console_scripts', 'batea')())
      File "/usr/local/bin/batea", line 25, in importlib_load_entry_point
        return next(matches).load()
      File "/usr/lib/python3.9/importlib/metadata.py", line 86, in load
        module = import_module(match.group('module'))
      File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "", line 1030, in _gcd_import
      File "", line 1007, in _find_and_load
      File "", line 972, in _find_and_load_unlocked
      File "", line 228, in _call_with_frames_removed
      File "", line 1030, in _gcd_import
      File "", line 1007, in _find_and_load
      File "", line 986, in _find_and_load_unlocked
      File "", line 680, in _load_unlocked
      File "", line 850, in exec_module
      File "", line 228, in _call_with_frames_removed
      File "/home/kali/git/batea/batea/__init__.py", line 18, in <module>
        from .core.nmap_parser import NmapReportParser
      File "/home/kali/git/batea/batea/core/__init__.py", line 22, in <module>
        from .model import BateaModel
      File "/home/kali/git/batea/batea/core/model.py", line 18, in <module>
        from sklearn.ensemble import IsolationForest
      File "/usr/local/lib/python3.9/dist-packages/sklearn/__init__.py", line 80, in <module>
        from .base import clone
      File "/usr/local/lib/python3.9/dist-packages/sklearn/base.py", line 21, in <module>
        from .utils import _IS_32BIT
      File "/usr/local/lib/python3.9/dist-packages/sklearn/utils/__init__.py", line 20, in <module>
        from scipy.sparse import issparse
      File "/usr/lib/python3/dist-packages/scipy/sparse/__init__.py", line 228, in <module>
        from .csr import *
      File "/usr/lib/python3/dist-packages/scipy/sparse/csr.py", line 10, in <module>
        from ._sparsetools import (csr_tocsc, csr_tobsr, csr_count_blocks,
    ImportError: numpy.core.multiarray failed to import

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ lsb_release -r
    Release: 2022.1

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ lsb_release -ra
    No LSB modules are available.
    Distributor ID: Kali
    Description: Kali GNU/Linux Rolling
    Release: 2022.1
    Codename: kali-rolling

    It looks like installing the requirements does not uninstall the existing numpy version and install the required one, right?

    But if I compare the installed dependencies with the required ones, I don't see any problem:

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ cat requirements.txt
    defusedxml==0.6.0
    numpy==1.19.4
    click==7.1.2
    scikit-learn==0.23.2
    pandas==1.1.4

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ python3 -m pip show numpy
    Name: numpy
    Version: 1.19.4
    Summary: NumPy is the fundamental package for array computing with Python.
    Home-page: https://www.numpy.org
    Author: Travis E. Oliphant et al.
    Author-email:
    License: BSD
    Location: /home/kali/.local/lib/python3.9/site-packages
    Requires:
    Required-by: ImageHash, pandas, PyWavelets, scikit-learn

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ python3 -m pip show defusedxml
    Name: defusedxml
    Version: 0.6.0
    Summary: XML bomb protection for Python stdlib modules
    Home-page: https://github.com/tiran/defusedxml
    Author: Christian Heimes
    Author-email: [email protected]
    License: PSFL
    Location: /home/kali/.local/lib/python3.9/site-packages
    Requires:
    Required-by: batea

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ python3 -m pip show click
    Name: click
    Version: 7.1.2
    Summary: Composable command line interface toolkit
    Home-page: https://palletsprojects.com/p/click/
    Author:
    Author-email:
    License: BSD-3-Clause
    Location: /home/kali/.local/lib/python3.9/site-packages
    Requires:
    Required-by: uvicorn

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ python3 -m pip show scikit-learn
    Name: scikit-learn
    Version: 0.23.2
    Summary: A set of python modules for machine learning and data mining
    Home-page: http://scikit-learn.org
    Author:
    Author-email:
    License: new BSD
    Location: /home/kali/.local/lib/python3.9/site-packages
    Requires: joblib, numpy, scipy, threadpoolctl
    Required-by:

    ┌──(kali㉿kali)-[~/git/batea]
    └─$ python3 -m pip show pandas
    Name: pandas
    Version: 1.1.4
    Summary: Powerful data structures for data analysis, time series, and statistics
    Home-page: https://pandas.pydata.org
    Author:
    Author-email:
    License: BSD
    Location: /home/kali/.local/lib/python3.9/site-packages
    Requires: numpy, python-dateutil, pytz
    Required-by:

    So, could you help me, please?
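One way to narrow this down, offered as a sketch and not a confirmed diagnosis: the logs above show packages under three different locations (`/usr/lib/python3/dist-packages`, `/usr/local/lib/python3.9/dist-packages`, and `/home/kali/.local/lib/python3.9/site-packages`), and `sudo batea` may resolve different copies than `python3 -m pip show` run as the normal user. The helper name `resolved_origins` is hypothetical; it only reports where each package would be imported from, without importing it.

```python
import importlib.util


def resolved_origins(package_names):
    """Map each top-level package name to the file it would be imported from.

    Packages that cannot be found map to None. Nothing is imported, so a
    broken binary extension does not prevent the check from running.
    """
    origins = {}
    for name in package_names:
        spec = importlib.util.find_spec(name)
        origins[name] = spec.origin if spec is not None else None
    return origins


if __name__ == "__main__":
    # Package list taken from the traceback and requirements.txt above.
    for name, origin in resolved_origins(
        ["numpy", "scipy", "sklearn", "pandas"]
    ).items():
        print(f"{name:10s} {origin}")
```

Running the script once as the normal user and once with `sudo` should show whether the two environments pick up numpy and scipy from different directories.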

    opened by alonsoir 1
  • Unusual ports feature

    Thoughts on adding a feature for unusual or hacker-related ports? I was thinking of ports like 4444, 1337, and 31337. The first is Metasploit's default port, while the other two are "hacker" port numbers. Since they aren't normally used, I think they would be worth investigating if they showed up in a scan.
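The idea could be sketched as a per-host count of open ports that fall in a watch list. This is a minimal illustration, not Batea's feature API: the names `UNUSUAL_PORTS` and `count_unusual_ports` and the sample hosts are hypothetical, and the port set is just the three ports mentioned above.

```python
# Hypothetical watch list, taken from the ports named in the suggestion.
UNUSUAL_PORTS = frozenset({4444, 1337, 31337})


def count_unusual_ports(open_ports, unusual_ports=UNUSUAL_PORTS):
    """Return how many distinct open ports of a host are in the watch list."""
    return len(set(open_ports) & unusual_ports)


# Made-up scan results: host address -> list of open ports.
hosts = {
    "192.168.0.10": [22, 80, 443],
    "192.168.0.23": [22, 4444, 31337],
}
features = {ip: count_unusual_ports(ports) for ip, ports in hosts.items()}
print(features)
```

A count keeps the feature numeric, which fits the numerical host representation described in the overview; a boolean flag would work just as well.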

    opened by DrDinosaur 1
  • Asset without port bug fix

    Pandas automatically converts columns that contain both integers and null values to float, but the numpy CSV reader only reads strings. The CSV parser therefore converts each port-number string to float and then to integer, since that is the most general order of operations.
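The string to float to integer conversion described above can be sketched like this. The function name `parse_port` and the handling of empty or `nan` fields are assumptions for illustration, not the code from the actual fix.

```python
def parse_port(raw):
    """Convert a CSV port field to int, or None when the field is empty.

    Going through float first accepts both "80" and "80.0", which is how a
    float-typed pandas column is written out.
    """
    text = raw.strip()
    if text == "" or text.lower() == "nan":
        return None  # asset without a port
    return int(float(text))


for field in ["443", "80.0", "", "nan"]:
    print(repr(field), "->", parse_port(field))
```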

    opened by SergeOlivierP 0
  • Bump numpy from 1.19.4 to 1.22.0

    Bumps numpy from 1.19.4 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)
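The replacements named in these notes can be sketched as follows, assuming NumPy is installed. The sample CSV text is made up for illustration.

```python
import io
import pickle

import numpy as np

# pickle.loads is the recommended replacement for the removed numpy.loads.
payload = pickle.dumps(np.arange(3))
restored = pickle.loads(payload)

# Made-up CSV with one missing field in the second row.
csv_text = "1,2,3\n4,,6\n"

# numpy.genfromtxt replaces ndfromtxt (plain array, missing value -> nan)...
plain = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# ...and, with usemask=True, replaces mafromtxt (masked array).
masked = np.genfromtxt(io.StringIO(csv_text), delimiter=",", usemask=True)

print(restored)
print(plain)
print(masked.mask)
```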

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Support MAC Address in training data

    Feature idea: using the MAC address might help reveal faked or randomized MACs, as well as vendors that are not supposed to show up in your network or that belong to obscure IoT embedded devices.
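Two signals could be derived from a MAC address, sketched below under stated assumptions: whether the locally administered bit (0x02 in the first octet) is set, which is typical of randomized or manually assigned addresses, and the three-octet vendor prefix (OUI). The function names and sample addresses are hypothetical, and this is not part of Batea.

```python
def _octets(mac):
    """Split a MAC written with ':' or '-' separators into its octet strings."""
    return mac.replace("-", ":").split(":")


def is_locally_administered(mac):
    """True when the locally administered bit of the first octet is set."""
    first_octet = int(_octets(mac)[0], 16)
    return bool(first_octet & 0x02)


def oui_prefix(mac):
    """Return the vendor prefix (first three octets) in upper-case colon form."""
    return ":".join(octet.upper().zfill(2) for octet in _octets(mac)[:3])


for mac in ["00:1a:2b:3c:4d:5e", "da:a1:19:00:00:01"]:
    print(mac, is_locally_administered(mac), oui_prefix(mac))
```

The OUI could then be compared against a list of vendors expected on the network; which vendors count as unexpected is site-specific and left open here.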

    opened by RRAlex 1
Owner
Secureworks Taegis VDR
Automatically identify and prioritize vulnerabilities for intelligent remediation.