Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API.

elastic

Last update: Dec 30, 2022

Related tags

Search python elasticsearch machine-learning big-data etl scikit-learn pandas lightgbm data-analysis dataframe dataframes time-series-forecasting eland

Overview

About

Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API.

Where possible, the package uses existing Python APIs and data structures to make it easy to switch from NumPy, pandas, and scikit-learn to their Elasticsearch-powered equivalents. In general, the data resides in Elasticsearch and not in memory, which allows Eland to access large datasets stored in Elasticsearch.

Eland also provides tools to upload trained machine learning models from common libraries like scikit-learn, XGBoost, and LightGBM into Elasticsearch.

Getting Started

Eland can be installed from PyPI with Pip:

$ python -m pip install eland

Eland can also be installed from Conda Forge with Conda:

$ conda install -c conda-forge eland

Compatibility

Supports Python 3.7+ and Pandas 1.3
Supports Elasticsearch clusters running 7.11 or later; 7.14 or later is recommended for all features to work.

Connecting to Elasticsearch

Eland uses the Elasticsearch low level client to connect to Elasticsearch. This client supports a range of connection options and authentication options.

You can pass either an instance of elasticsearch.Elasticsearch to Eland APIs or a string containing the host to connect to:

import eland as ed

# Connecting to an Elasticsearch instance running on 'localhost:9200'
df = ed.DataFrame("localhost:9200", es_index_pattern="flights")

# Connecting to an Elastic Cloud instance
from elasticsearch import Elasticsearch

es = Elasticsearch(
    cloud_id="cluster-name:...",
    http_auth=("elastic", "<password>")
)
df = ed.DataFrame(es, es_index_pattern="flights")

DataFrames in Eland

eland.DataFrame wraps an Elasticsearch index in a Pandas-like API and defers all processing and filtering of data to Elasticsearch instead of your local machine. This means you can process large amounts of data within Elasticsearch from a Jupyter Notebook without overloading your machine.

➤ Eland DataFrame API documentation

➤ Advanced examples in a Jupyter Notebook


>>> import eland as ed

>>> # Connect to 'flights' index via localhost Elasticsearch node
>>> df = ed.DataFrame('localhost:9200', 'flights')

# eland.DataFrame instance has the same API as pandas.DataFrame
# except all data is in Elasticsearch. See .info() memory usage.
>>> df.head()
   AvgTicketPrice  Cancelled  ... dayOfWeek           timestamp
0      841.265642      False  ...         0 2018-01-01 00:00:00
1      882.982662      False  ...         0 2018-01-01 18:27:00
2      190.636904      False  ...         0 2018-01-01 17:11:14
3      181.694216       True  ...         0 2018-01-01 10:33:28
4      730.041778      False  ...         0 2018-01-01 05:13:00

[5 rows x 27 columns]

>>> df.info()
<class 'eland.dataframe.DataFrame'>
Index: 13059 entries, 0 to 13058
Data columns (total 27 columns):
 #   Column              Non-Null Count  Dtype         
---  ------              --------------  -----         
 0   AvgTicketPrice      13059 non-null  float64       
 1   Cancelled           13059 non-null  bool          
 2   Carrier             13059 non-null  object        
...      
 24  OriginWeather       13059 non-null  object        
 25  dayOfWeek           13059 non-null  int64         
 26  timestamp           13059 non-null  datetime64[ns]
dtypes: bool(2), datetime64[ns](1), float64(5), int64(2), object(17)
memory usage: 80.0 bytes
Elasticsearch storage usage: 5.043 MB

# Filtering of rows using comparisons
>>> df[(df.Carrier=="Kibana Airlines") & (df.AvgTicketPrice > 900.0) & (df.Cancelled == True)].head()
     AvgTicketPrice  Cancelled  ... dayOfWeek           timestamp
8        960.869736       True  ...         0 2018-01-01 12:09:35
26       975.812632       True  ...         0 2018-01-01 15:38:32
311      946.358410       True  ...         0 2018-01-01 11:51:12
651      975.383864       True  ...         2 2018-01-03 21:13:17
950      907.836523       True  ...         2 2018-01-03 05:14:51

[5 rows x 27 columns]

# Running aggregations across an index
>>> df[['DistanceKilometers', 'AvgTicketPrice']].aggregate(['sum', 'min', 'std'])
     DistanceKilometers  AvgTicketPrice
sum        9.261629e+07    8.204365e+06
min        0.000000e+00    1.000205e+02
std        4.578263e+03    2.663867e+02
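
To illustrate what deferring a filter to Elasticsearch can look like, here is a minimal sketch that turns the three comparisons from the filtering example above into an Elasticsearch bool query body. The helper names (term, greater_than, all_of) are hypothetical and are not part of Eland, and the exact query Eland generates may differ. The term clause also assumes that Carrier is mapped as a keyword field.

```python
import json

def term(field, value):
    """Exact-match clause (hypothetical helper, not an Eland API)."""
    return {"term": {field: value}}

def greater_than(field, value):
    """Range clause for `field > value` (hypothetical helper)."""
    return {"range": {field: {"gt": value}}}

def all_of(*clauses):
    """Combine clauses so that every one must match, like `&` on boolean masks."""
    return {"bool": {"must": list(clauses)}}

query = all_of(
    term("Carrier", "Kibana Airlines"),
    greater_than("AvgTicketPrice", 900.0),
    term("Cancelled", True),
)

# Only this small body would travel to the cluster; no rows are loaded locally.
print(json.dumps({"query": query}, indent=2))
```

Because the filter is expressed as a query, only the matching rows need to be returned to the client.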

Machine Learning in Eland

Eland can serialize trained models from the scikit-learn, XGBoost, and LightGBM libraries and import them into Elasticsearch for use as inference models.

➤ Eland Machine Learning API documentation


>>> from xgboost import XGBClassifier
>>> from eland.ml import MLModel

# Train and exercise an XGBoost ML model locally
>>> xgb_model = XGBClassifier(booster="gbtree")
>>> xgb_model.fit(training_data[0], training_data[1])

>>> xgb_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

# Import the model into Elasticsearch
>>> es_model = MLModel.import_model(
    es_client="localhost:9200",
    model_id="xgb-classifier",
    model=xgb_model,
    feature_names=["f0", "f1", "f2", "f3", "f4"],
)

# Exercise the ML model in Elasticsearch with the training data
>>> es_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

Comments

Add `iterrows()` and `itertuples()` DataFrame API, its usage is similar to `pandas`
Related to this issue: Can we get batch data using df.to_pandas() in the case of big data? Closes #345

Add to_pandas_in_batch() DataFrame API

Then we can use the code below to get DataFrames in batches when working with big data:

pd_df_iterator = ed_df.to_pandas_in_batch(batch_size=1000)
for pd_df in pd_df_iterator:
    print(pd_df)

If there is anything wrong with the code, please give some suggestions. Thank you!
opened by kxbin 25
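
The batching idea in this proposal can be sketched without Elasticsearch. The generator below is a minimal illustration, assuming the rows arrive as any Python iterable. The name iter_batches is hypothetical, and this is not how to_pandas_in_batch() is implemented.

```python
from itertools import islice

def iter_batches(rows, batch_size):
    """Yield lists of at most `batch_size` rows without materializing all rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

batches = list(iter_batches(range(2500), batch_size=1000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```

Only one batch is held in memory at a time while iterating, which is the property that matters for large indices.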
Fix issues following update to pandas 1.0.1
Change _stringify_path to stringify_path (make it public)

Fix info() formatting

Update output in notebooks cells and doctest strings

Keep the original dtype for some aggregate functions (min, max) by adding an optional parameter (keep_domain)

Closes #124
opened by mesejo 20
Add `iterrows()` and `itertuples()` DataFrame API, its usage is similar to `pandas`
I'm sorry about this: I wanted to squash multiple commits and accidentally modified the branch name, so the PR was automatically closed.

For historical Conversation see PR #369

@sethmlarson Thanks for your comments. Based on these suggestions, I have completed the modifications, and the lint and docs jobs pass.

Thanks to the change in #379, there is no need to convert _es_results_to_pandas into a generator now, because the two have similar effects. Performance has improved further and the logic is more concise.

Finally, on the same 50,000-record data set, my test results are as follows:

ed.iterrows(): a total of `19 seconds` for the iteration
ed.itertuples(): a total of `15 seconds` for the iteration
ed.to_pandas(): `15 seconds`

Closes https://github.com/elastic/eland/issues/345
opened by kxbin 19
Added support for 2 date formats:
This PR addresses #22 by adding support for 2 date formats:

epoch_millis

epoch_second

When the mappings are fetched, each date type will carry additional format information, e.g. date[epoch_millis], date[epoch_second], etc.

This will later on be converted to pandas types according to their format: datetime64[ms], datetime64[s]
enhancement
opened by viglia 19
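
As a rough illustration of the format handling described in this PR, the sketch below maps the two Elasticsearch date formats to the pandas dtype names mentioned above and converts raw epoch values with the standard library. FORMAT_TO_DTYPE and parse_epoch are hypothetical names chosen for this example, not Eland internals.

```python
from datetime import datetime, timezone

# Assumed mapping from Elasticsearch date formats to pandas dtype names,
# following the description in the PR text.
FORMAT_TO_DTYPE = {
    "epoch_millis": "datetime64[ms]",
    "epoch_second": "datetime64[s]",
}

def parse_epoch(value, es_format):
    """Convert an epoch value to a timezone-aware UTC datetime."""
    if es_format == "epoch_millis":
        seconds = value / 1000
    elif es_format == "epoch_second":
        seconds = value
    else:
        raise ValueError(f"unsupported format: {es_format}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

from_millis = parse_epoch(1514808575000, "epoch_millis")
from_seconds = parse_epoch(1514808575, "epoch_second")
print(from_millis.isoformat())  # 2018-01-01T12:09:35+00:00
```

Both calls describe the same instant; only the unit of the stored number differs.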
Add quantile to DataFrame and Series
Closes #315

Functionality matching pandas for quantile is implemented, i.e. df.agg(['quantile', ...]), df.quantile(), and Series.quantile().

Added tests and documentation.

@sethmlarson Please Review 😄

P.S. Need to change commit message while merging to master from percentile to quantile 😃
opened by V1NAY8 14
Refactor tests
Changes made are:

Refactored eland/tests/* to test_eland/* and imports used in tests.

Updated noxfile.py

Added code snippets to contributing.rst

All tests (Including tests, doctests) were run and successful

@sethmlarson Please review, Happy to make any changes required 😄
opened by V1NAY8 14
[ML] adding inference results tests for pytorch transformer models
This commit adds many integration tests for

downloading PyTorch models

Processing them locally prepping for Elasticsearch uploading

Uploading and deploying to Elasticsearch

Confirming inference results on the models.

The tests cover all our current NLP tasks over only some of our supported models.

The models chosen to test were chosen for the broadest coverage over our supported tokenization and model wrapped types.
ci topic:ml
opened by benwtrent 13
Add nbval to CI and notebook examples
Closes #88 The following changes are made:

Modified noxfile.py to include nbval tests which are executed when CI is run.

Added some notebook tests which are

# doctest: +SKIP tests in tests folder

A demo notebook of all basic functionality

ETL notebook which consists of tests for show_progress parameter

Metrics notebook which consists of median, mad => numeric_only tests

@sethmlarson Please review and let me know if any tests should be added or if any existing ones are unnecessary.

License headers are not included for .ipynb or .csv files. Is that acceptable?
opened by V1NAY8 12
Handle datetime types in comparison filters
Fixes issue #265, partially implements #236

I implemented the handling of Python's datetime objects when comparing with series. The datetime objects will be converted to a string in the Elasticsearch format "strict_date_optional_time" (if there is a more suitable format, let me know).

Todos:

[x] Handle datetime objects for less, less-equal, greater, greater-equal, equal and not-equal comparison

[x] Handle np.datetime64 objects for less, less-equal, greater, greater-equal, equal and not-equal comparison

[x] Write tests

Tell me if I missed some types that should be supported as well!

If this PR is helpful, I would appreciate this PR to be labeled with "hacktoberfest-accepted" (:

Cheers :v:
hacktoberfest-accepted
opened by Fju 12
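
The conversion described in this PR can be sketched as follows, assuming a range query is used for the ordering comparisons. The function datetime_range_clause is a hypothetical helper for illustration. It covers only the four ordering operators, not equality, and the query Eland builds may differ.

```python
from datetime import datetime

def datetime_range_clause(field, op, value):
    """Build a range clause comparing `field` with a datetime.

    `op` is one of the Elasticsearch range operators: gt, gte, lt, lte.
    """
    if op not in {"gt", "gte", "lt", "lte"}:
        raise ValueError(f"unsupported operator: {op}")
    # isoformat() yields e.g. '2018-01-03T21:13:17', a form that the
    # strict_date_optional_time format accepts.
    return {
        "range": {
            field: {op: value.isoformat(), "format": "strict_date_optional_time"}
        }
    }

clause = datetime_range_clause("timestamp", "gte", datetime(2018, 1, 3, 21, 13, 17))
print(clause["range"]["timestamp"]["gte"])  # 2018-01-03T21:13:17
```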
Switch agg defaults to numeric_only=None
Closes #254

Added numeric_only = True | False | None for all agg and aggs (except nunique)

Added logic where booleans are not supported by median_absolute_deviation

Added tests, doctests for every change made, also modified existing tests to work with these changes.

Nox and pytest sessions are successful.
opened by V1NAY8 12
Add mode to dataframe and series
Closes #215

Added mode to dataframe and series

Added tests, documentation

Currently mode isn't supported by pandas groupby either, hence I added a NotImplementedError.

To satisfy mypy I had to add type hints to some other methods too.

Implemented es_size parameter instead of dumping the entire cluster if we have multiple mode values.

Currently only dropna=True is supported because I have a question about the missing parameter of the terms aggregation.

@sethmlarson Please review 😃
opened by V1NAY8 10
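
For readers unfamiliar with the semantics, here is a small local sketch of what a mode computation with dropna=True and a cap on the number of returned values looks like. It uses collections.Counter on an in-memory list. The function and its es_size default are assumptions made for illustration and say nothing about how Eland queries Elasticsearch.

```python
from collections import Counter

def mode(values, es_size=10):
    """Return the most frequent values, capped at `es_size`, ignoring None."""
    counts = Counter(v for v in values if v is not None)  # dropna=True behavior
    if not counts:
        return []
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    return modes[:es_size]

print(mode(["JetBeats", "ES-Air", "JetBeats", "ES-Air", None]))
# ['ES-Air', 'JetBeats']
```

When several values tie for the highest count, all of them are modes, which is why a size limit is useful on high-cardinality fields.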
Remind users to sync saved objects in order for training model to show up in Kibana

We should consider automatically synchronizing Kibana saved objects after using eland_import_hub_model to import a training model to Elasticsearch, or add an output message to the docker run -it --rm --network host elastic/eland eland_import_hub_model script at the end of the process to instruct the user to either call the Kibana API or click on Synchronize saved objects from the ML UI to manually sync the objects in order for the training model to show up.

This will improve the user experience when using the import script.

opened by ppf2 0
How to download only desired entry rows from Eland?

Currently, if I want to create a dataframe with specific entries from an index, using Eland, I must first download a dataframe with all the columns, and then filter them out, locally... this is hardly desirable in terms of CPU and RAM local usage. Is there a way to push the filtering to the ES nodes instead, and simply download the already filtered dataframe?

opened by IavTavares 0
pydata-sphinx-theme has broken our docs
Some options:

Pin the pydata-sphinx-theme version

Change to a theme that's stable

Move all docs to elastic.co (this is what I'd like to do long term)
opened by sethmlarson 0
Document Rust requirement

Some users of the NLP features have reported that Rust is required to install eland. The dependency comes from the fast tokenizers in Hugging Face transformers.

Closes #495
documentation

opened by davidkyle 0
Include pitfall of `--start` in the README

Users who follow the Eland README as a guide to importing models can easily end up seeing inexplicably poor performance due to unknowingly running the model with one allocation and one thread per allocation.

This change spells out the effect of --start and links to alternatives that allow better use of available hardware.
documentation

opened by droberts195 0

Releases(v8.3.0)

v8.3.0(Jul 11, 2022)
Added

Added a new NLP model task type "auto" which infers the task type based on model configuration and architecture (#475)

Changed

Changed required version of 'torch' package to >=1.11.0,<1.12 to match required PyTorch version for Elasticsearch 8.3 (was >=1.9.0,<2) (#479)

Changed the default value of the --task-type parameter for the eland_import_hub_model CLI to be "auto" (#475)

Fixed

Fixed decision tree classifier serialization to account for probabilities (#465)

Fixed PyTorch model quantization (#472)

Source code(tar.gz)
Source code(zip)
eland-8.3.0-py3-none-any.whl(140.30 KB)
eland-8.3.0.tar.gz(115.78 KB)
v8.2.0(May 11, 2022)
Added

Added support for passing Cloud ID via --cloud-id to eland_import_hub_model CLI tool (#462)

Added support for authenticating via --es-username, --es-password, and --es-api-key to the eland_import_hub_model CLI tool (#461)

Added support for XGBoost 1.6 (#458)

Added support for question_answering NLP tasks (#457)

Source code(tar.gz)
Source code(zip)
eland-8.2.0-py3-none-any.whl(139.20 KB)
eland-8.2.0.tar.gz(114.89 KB)
v8.1.0(Mar 31, 2022)
Added

Added support for eland.Series.unique() (#448, contributed by @V1NAY8)

Added --ca-certs and --insecure options to eland_import_hub_model for configuring TLS (#441)

Source code(tar.gz)
Source code(zip)
eland-8.1.0-py3-none-any.whl(134.68 KB)
eland-8.1.0.tar.gz(112.02 KB)
v8.0.0(Feb 10, 2022)
Added

Added support for Natural Language Processing (NLP) models using PyTorch (#394)

Added new extra eland[pytorch] for installing all dependencies needed for PyTorch (#394)

Added a CLI script eland_import_hub_model for uploading HuggingFace models to Elasticsearch (#403)

Added support for v8.0 of the Python Elasticsearch client (#415)

Added a warning if Eland detects it's communicating with an incompatible Elasticsearch version (#419)

Added support for number_samples to LightGBM and Scikit-Learn models (#397, contributed by @V1NAY8)

Added ability to use datetime types for filtering dataframes (#284, contributed by @Fju)

Added pandas datetime64 type to use the Elasticsearch date type (#425, contributed by @Ashton-Sidhu)

Added es_verify_mapping_compatibility parameter to disable schema enforcement with pandas_to_eland (#423, contributed by @Ashton-Sidhu)

Changed

Changed to_pandas() to only use Point-in-Time and search_after instead of using Scroll APIs for pagination.
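
The search_after pattern mentioned above can be sketched in plain Python. The example below is a simplified in-memory stand-in: fake_search and paginate are hypothetical names, sort values are single integers rather than the lists Elasticsearch returns, and the static list plays the role of the consistent view that a Point-in-Time provides.

```python
def fake_search(docs, size, search_after=None):
    """Stand-in for a sorted search request (hypothetical, in-memory).

    Returns up to `size` docs whose sort key is greater than `search_after`.
    """
    ordered = sorted(docs, key=lambda d: d["sort"])
    if search_after is not None:
        ordered = [d for d in ordered if d["sort"] > search_after]
    return ordered[:size]

def paginate(docs, size):
    """Yield batches of hits, resuming each page after the last sort value."""
    search_after = None
    while True:
        hits = fake_search(docs, size, search_after)
        if not hits:
            return
        yield hits
        search_after = hits[-1]["sort"]

docs = [{"_id": str(i), "sort": i} for i in range(7)]
pages = [[hit["_id"] for hit in page] for page in paginate(docs, size=3)]
print(pages)  # [['0', '1', '2'], ['3', '4', '5'], ['6']]
```

Each request carries only the last sort value of the previous page, so the client does not need to keep a scroll context alive between requests.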

Source code(tar.gz)
Source code(zip)
eland-8.0.0-py3-none-any.whl(134.03 KB)
eland-8.0.0.tar.gz(9.25 MB)
v8.0.0b1(Dec 16, 2021)
Added

Added support for Natural Language Processing (NLP) models using PyTorch (https://github.com/elastic/eland/pull/394)

Added new extra eland[pytorch] for installing all dependencies needed for PyTorch (https://github.com/elastic/eland/pull/394)

Added a CLI script eland_import_hub_model for uploading HuggingFace models to Elasticsearch (https://github.com/elastic/eland/pull/403)

Added support for v8.0 of the Python Elasticsearch client (https://github.com/elastic/eland/pull/415)

Added a warning if Eland detects it's communicating with an incompatible Elasticsearch version (https://github.com/elastic/eland/pull/419)

Added support for number_samples to LightGBM and Scikit-Learn models (https://github.com/elastic/eland/pull/397, contributed by @V1NAY8)

Changed

Changed to_pandas() to only use Point-in-Time and search_after instead of using Scroll APIs for pagination.

Source code(tar.gz)
Source code(zip)
eland-8.0.0b1-py3-none-any.whl(133.76 KB)
eland-8.0.0b1.tar.gz(9.25 MB)
v7.14.1b1(Aug 30, 2021)
Added

Added support for DataFrame.iterrows() and DataFrame.itertuples() (#380, contributed by @kxbin)

Performance

Simplified result collectors to increase performance transforming Elasticsearch results to pandas (#378, contributed by @V1NAY8)

Changed search pagination function to yield batches of hits (#379)

Source code(tar.gz)
Source code(zip)
eland-7.14.1b1-py3-none-any.whl(124.39 KB)
eland-7.14.1b1.tar.gz(102.74 KB)
v7.14.0b1(Aug 9, 2021)
Added

Added support for Pandas 1.3.x (#362, contributed by @V1NAY8)

Added support for LightGBM 3.x (#362, contributed by @V1NAY8)

Added DataFrame.idxmax() and DataFrame.idxmin() methods (#353, contributed by @V1NAY8)

Added type hints to eland.ndframe and eland.operations (#366, contributed by @V1NAY8)

Removed

Removed support for Pandas <1.2 (#364)

Removed support for Python 3.6 to match Pandas (#364)

Changed

Changed paginated search function to use Point-in-Time and Search After features instead of Scroll when connected to Elasticsearch 7.12+ (#370 and #376, contributed by @V1NAY8)

Optimized the FieldMappings.aggregate_field_name() method (#373, contributed by @V1NAY8)

Source code(tar.gz)
Source code(zip)
eland-7.14.0b1-py3-none-any.whl(123.97 KB)
eland-7.14.0b1.tar.gz(102.71 KB)
v7.13.0b1(Jun 22, 2021)
Added

Added DataFrame.quantile(), Series.quantile(), and DataFrameGroupBy.quantile() aggregations (#318 and #356, contributed by @V1NAY8)
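
One plausible way to express a quantile request to Elasticsearch is the percentiles aggregation, which takes percentages rather than fractions. The sketch below builds such a request body. The function name and the aggregation name are hypothetical, and whether Eland builds its request exactly this way is an assumption. Note that Elasticsearch computes percentiles approximately by default, so results can differ slightly from pandas.

```python
def quantile_aggregation(field, q):
    """Build a percentiles aggregation for quantiles given as fractions in [0, 1]."""
    quantiles = [q] if isinstance(q, (int, float)) else list(q)
    for value in quantiles:
        if not 0 <= value <= 1:
            raise ValueError("quantiles must be between 0 and 1")
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            f"{field}_quantiles": {
                "percentiles": {
                    "field": field,
                    "percents": [value * 100 for value in quantiles],
                }
            }
        },
    }

body = quantile_aggregation("AvgTicketPrice", [0.25, 0.5, 0.75])
print(body["aggs"]["AvgTicketPrice_quantiles"]["percentiles"]["percents"])
# [25.0, 50.0, 75.0]
```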

Changed

Changed the error raised when es_index_pattern doesn't point to any indices to be more user-friendly (#346)

Fixed

Fixed a warning about conflicting field types when wildcards are used in es_index_pattern (#346)

Fixed sorting when using DataFrame.groupby() with dropna (#322, contributed by @V1NAY8)

Fixed deprecated usage of numpy.int in favor of numpy.int_ (#354, contributed by @V1NAY8)

Source code(tar.gz)
Source code(zip)
eland-7.13.0b1-py3-none-any.whl(120.72 KB)
eland-7.13.0b1.tar.gz(99.19 KB)
7.10.1b1(Jan 12, 2021)
Added

Added support for Pandas 1.2.0 (#336)

Added DataFrame.mode() and Series.mode() aggregation (#323, contributed by @V1NAY8)

Added support for pd.set_option("display.max_rows", None) (#308, contributed by @V1NAY8)

Added Elasticsearch storage usage to df.info() (#321, contributed by @V1NAY8)

Removed

Removed deprecated aliases read_es, read_csv, DataFrame.info_es, and MLModel(overwrite=True) (#331, contributed by @V1NAY8)

Source code(tar.gz)
Source code(zip)
eland-7.10.1b1-py3-none-any.whl(118.00 KB)
eland-7.10.1b1.tar.gz(96.54 KB)
7.10.0b1(Oct 29, 2020)
Added

Added DataFrame.groupby() method with all aggregations (#278, #291, #292, #300 contributed by @V1NAY8)

Added es_match() method to DataFrame and Series for filtering rows with full-text search (#301)

Added support for type hints of the elasticsearch-py package (#295)

Added support for passing dictionaries to es_type_overrides parameter in the pandas_to_eland() function to directly control the field mapping generated in Elasticsearch (#310)

Added es_dtypes property to DataFrame and Series (#285)

Changed

Changed pandas_to_eland() to use the parallel_bulk() helper instead of single-threaded bulk() helper to improve performance (#279, contributed by @V1NAY8)

Changed the es_type_overrides parameter in pandas_to_eland() to raise ValueError if an unknown column is given (#302)

Changed DataFrame.filter() to preserve the order of items (#283, contributed by @V1NAY8)

Changed pandas_to_eland() so that setting es_type_overrides={"column": "text"} automatically adds the column.keyword sub-field, making aggregations available for the field as well (#310)
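
For context on the last item, a text field with a keyword sub-field is a standard Elasticsearch multi-field mapping. The sketch below shows the general shape of such a mapping. The helper name is hypothetical, and the exact mapping that pandas_to_eland() generates may include additional settings.

```python
def text_with_keyword():
    """Mapping for a `text` field with a `keyword` sub-field for aggregations."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}},
    }

# "column" stands in for the DataFrame column being overridden.
mapping = {"properties": {"column": text_with_keyword()}}
print(mapping["properties"]["column"]["fields"]["keyword"]["type"])  # keyword
```

Full-text queries then target column, while aggregations and exact matches target column.keyword.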

Fixed

Fixed Series.__repr__ when the series is empty (#306)

Source code(tar.gz)
Source code(zip)
eland-7.10.0b1-py3-none-any.whl(198.79 KB)
eland-7.10.0b1.tar.gz(128.32 KB)
7.9.1a1(Sep 30, 2020)
Added

Added the predict() method and model_type, feature_names, and results_field properties to MLModel (#266)

Deprecated

Deprecated ImportedMLModel in favor of MLModel.import_model(...) (#266)

Changed

Changed DataFrame aggregations to use numeric_only=None instead of numeric_only=True by default. This is the same behavior as Pandas (#270, contributed by @V1NAY8)

Fixed

Fixed DataFrame.agg() to properly return a Series instead of a DataFrame when given a string instead of a list of aggregations (#263, contributed by @V1NAY8)

Source code(tar.gz)
Source code(zip)
eland-7.9.1a1-py3-none-any.whl(181.22 KB)
eland-7.9.1a1.tar.gz(110.35 KB)
7.9.0a1(Aug 18, 2020)
7.9.0a1 (2020-08-18)

Added

Added support for Pandas v1.1 (#253)

Added support for LightGBM LGBMRegressor and LGBMClassifier to ImportedMLModel (#247, #252)

Added support for multi:softmax and multi:softprob XGBoost operators to ImportedMLModel (#246)

Added column names to DataFrame.__dir__() for better auto-completion support (#223, contributed by @leonardbinet)

Added support for es_if_exists='append' to pandas_to_eland() (#217)

Added support for aggregating datetimes with nunique and mean (#253)

Added es_compress_model_definition parameter to ImportedMLModel constructor (#220)

Added .size and .ndim properties to DataFrame and Series (#231 and #233)

Added .dtype property to Series (#258)

Added support for using pandas.Series with Series.isin() (#231)

Added type hints to many APIs in DataFrame and Series (#231)

Deprecated

Deprecated the overwrite parameter in favor of es_if_exists in ImportedMLModel constructor (#249, contributed by @V1NAY8)

Changed

Changed aggregations for datetimes to be higher precision when available (#253)

Fixed

Fixed ImportedMLModel.predict() to fail when errors are present in the ingest.simulate response (#220)

Fixed Series.median() aggregation to return a scalar instead of pandas.Series (#253)

Fixed Series.describe() to return a pandas.Series instead of pandas.DataFrame (#258)

Fixed DataFrame.mean() and Series.mean() dtype (#258)

Fixed DataFrame.agg() aggregations when using extended_stats Elasticsearch aggregation (#253)

Source code(tar.gz)
Source code(zip)
eland-7.9.0a1-py3-none-any.whl(178.54 KB)
eland-7.9.0a1.tar.gz(107.78 KB)
7.7.0a1(Aug 12, 2020)
7.7.0a1 (2020-05-20)

Added

Added the package to Conda Forge, install via conda install -c conda-forge eland (#209)

Added DataFrame.sample() and Series.sample() for querying a random sample of data from the index (#196, contributed by @mesejo)

Added Series.isna() and Series.notna() for filtering out missing, NaN or null values from a column (#210, contributed by @mesejo)

Added DataFrame.filter() and Series.filter() for reducing an axis using a sequence of items or a pattern (#212)

Added DataFrame.to_pandas() and Series.to_pandas() for converting an Eland dataframe or series into a Pandas dataframe or series inline (#208)

Added support for XGBoost v1.0.0 (#200)

Deprecated

Deprecated info_es() in favor of es_info() (#208)

Deprecated eland.read_csv() in favor of eland.csv_to_eland() (#208)

Deprecated eland.read_es() in favor of eland.DataFrame() (#208)

Changed

Changed var and std aggregations to use sample instead of population in line with Pandas (#185)

Changed painless scripts to use source rather than inline to improve script caching performance (#191, contributed by @mesejo)

Changed minimum elasticsearch Python library version to v7.7.0 (#207)

Changed name of Index.field_name to Index.es_field_name (#208)
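
The difference behind the var and std change listed above can be seen with the standard library alone. The sample estimator divides by n - 1, which is the pandas default, while the population estimator divides by n. The values below are made up for illustration.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

population_std = statistics.pstdev(values)  # divides by n
sample_std = statistics.stdev(values)       # divides by n - 1, like pandas

print(population_std)  # 2.0
print(sample_std > population_std)  # True
```

On small result sets the two estimators differ noticeably, so matching the pandas default avoids surprises when comparing Eland output with local pandas output.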

Fixed

Fixed DeprecationWarning raised from pandas.Series when an empty series was created without specifying dtype (#188, contributed by @mesejo)

Fixed a bug when filtering columns on complex combinations of and and or (#204)

Fixed an issue where DataFrame.shape would return a larger value than in the index if a sized operation like .head(X) was applied to the data frame (#205, contributed by @mesejo)

Fixed issue where both scikit-learn and xgboost libraries were required to use eland.ml.ImportedMLModel, now only one library is required to use this feature (#206)

Source code(tar.gz)
Source code(zip)
eland-7.7.0a1-py3-none-any.whl(137.93 KB)
eland-7.7.0a1.tar.gz(99.58 KB)
7.6.0a5(Aug 12, 2020)
7.6.0a5 (2020-04-14)

Added

Added support for Pandas v1.0.0 (#141, contributed by @mesejo)

Added use_pandas_index_for_es_ids parameter to pandas_to_eland() (#154)

Added es_type_overrides parameter to pandas_to_eland() (#181)

Added NDFrame.var(), .std() and .median() aggregations (#175, #176, contributed by @mesejo)

Added DataFrame.es_query() to allow modifying ES queries directly (#156)

Added eland.__version__ (#153, contributed by @mesejo)

Changed

Changed ML model serialization to be slightly smaller (#159)

Changed minimum elasticsearch Python library version to v7.6.0 (#181)

Fixed

Fixed inference_config being required on ML models for ES >=7.8 (#174)

Fixed unpacking for DataFrame.aggregate("median") (#161)

Removed

Removed support for Python 3.5 (#150)

Removed eland.Client() interface, use elasticsearch.Elasticsearch() client instead (#166)

Removed all private objects from top-level eland namespace (#170)

Removed geo_points from pandas_to_eland() in favor of es_type_overrides (#181)

Source code(tar.gz)
Source code(zip)
eland-7.6.0a5-py3-none-any.whl(143.67 KB)
eland-7.6.0a5.tar.gz(91.15 KB)
7.6.0a4(Aug 12, 2020)
7.6.0a4 (2020-03-23)

Fixed

Fixed issue in DataFrame.info() when called on an empty frame (#135)

Fixed issues where many _source fields would generate a too_long_frame error (#135, #137)

Changed

Changed requirement for xgboost from >=0.90 to ==0.90

Source code(tar.gz)
Source code(zip)
eland-7.6.0a4-py3-none-any.whl(139.21 KB)
eland-7.6.0a4.tar.gz(87.68 KB)

Owner

elastic

GitHub https://eland.readthedocs.io

Es-schema - Common Data Schemas for Elasticsearch

Common Data Schemas for Elasticsearch The Common Data Schema for Elasticsearch i

2 Jan 25, 2022

A real-time tech course finder, created using Elasticsearch, Python, React+Redux, Docker, and Kubernetes.

130 Dec 20, 2022

esguard provides a Python decorator that waits for processing while monitoring the load of Elasticsearch.

esguard esguard provides a Python decorator that waits for processing while monitoring the load of Elasticsearch. Quick Start You need to launch elast

5 Dec 8, 2021

A library for fast import of Windows NT Registry(REGF) into Elasticsearch.

3 Apr 1, 2022

A library for fast parse & import of Windows Prefetch into Elasticsearch.

prefetch2es Fast import of Windows Prefetch(.pf) into Elasticsearch. prefetch2es uses C library libscca. Usage When using from the commandline interfa

5 Nov 24, 2022

Pysolr — Python Solr client

pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.

626 Dec 1, 2022

Google Project: Search and auto-complete sentences within given input text files, manipulating data with complex data-structures.

Auto-Complete Google Project In this project there is an implementation for one feature of Google's search engines - AutoComplete. Autocomplete, or wo

10 Jun 20, 2022

Senginta is All in one Search Engine Scrapper for used by API or Python Module. It's Free!

Senginta is All in one Search Engine Scrapper. With traditional scrapping, Senginta can be powerful to get result from any Search Engine, and convert to Json. Now support only for Google Product Search Engine (GShop, GVideo and many too) and Baidu Search Engine.

33 Nov 21, 2022
