Galvanalyser is a system for automatically storing data generated by battery cycling machines in a database

Overview

Galvanalyser is a system for automatically storing data generated by battery cycling machines in a database. It uses a set of "harvesters", whose job is to monitor the datafiles produced by the battery testers and upload them in a standard format to the server database. The server database is a relational database that stores each dataset along with information about column types, units, and other relevant metadata (e.g. cell information, owner, and the purpose of the experiment).

There are two user interfaces to the system:

  • a web app front-end that can be used to view the stored datasets, manage the harvesters, and input metadata for each dataset
  • a REST API that can be used to download dataset metadata and the data itself. The API conforms to the battery-api OpenAPI specification, so tools based on that specification (e.g. the Python client) can use it; a minimal sketch of such a client is shown below.
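
As a rough illustration of using the REST API from Python (the base URL, endpoint path, and token handling here are assumptions for the sketch, not the documented battery-api routes):

    # A minimal sketch of downloading dataset metadata over the REST API.
    # The base URL, endpoint path, and auth scheme are assumptions for
    # illustration; consult the battery-api OpenAPI specification for the
    # actual routes and parameters.
    import requests

    BASE_URL = "http://localhost:5000"  # hypothetical Galvanalyser server
    TOKEN = "..."  # an access token issued by the server

    headers = {"Authorization": f"Bearer {TOKEN}"}

    # List the datasets visible to this user (hypothetical endpoint).
    response = requests.get(f"{BASE_URL}/datasets", headers=headers)
    response.raise_for_status()

    for dataset in response.json():
        print(dataset["id"], dataset.get("name"))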

A diagram of the logical structure of the system is shown below; the arrows indicate the direction of data flow.

[Diagram: the logical relationship of the various Galvanalyser components]

Project documentation

The documentation directory contains more detailed documentation on a number of topics:

  • FirstTimeQuickSetup.md - A quick start guide to setting up your first complete Galvanalyser system
  • AdministrationGuide.md - A guide to performing administration tasks such as creating users and setting up harvesters
  • DevelopmentGuide.md - A guide for developers on Galvanalyser
  • ProjectStructure.md - An overview of the project folder structure to guide developers to the locations of the various parts of the project

Technology used

This section provides a brief overview of the technology used to implement the different parts of the project.

Docker

Dockerfiles are provided to run all components of this project in containers. A docker-compose file exists to simplify starting the complete server-side system, including the database, the web app and the Nginx server. All components of the project can be run natively; however, using Docker simplifies this greatly.

A Docker container is also used for building the web app and its dependencies, to simplify cross-platform deployment and ensure a consistent and reliable build process.

Backend server

The server is a Flask web application, which uses SQLAlchemy and psycopg2 to interface with the Postgres database.
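
As a rough sketch of how these pieces fit together (the connection URL and the "dataset" table here are assumptions for illustration, not the actual Galvanalyser schema):

    # A minimal sketch of a Flask endpoint backed by SQLAlchemy, with
    # psycopg2 as the PostgreSQL driver. The connection URL and the
    # "dataset" table are assumptions for illustration only.
    from flask import Flask, jsonify
    from sqlalchemy import create_engine, text

    app = Flask(__name__)
    engine = create_engine(
        "postgresql+psycopg2://user:password@localhost/galvanalyser"
    )

    @app.route("/datasets")
    def list_datasets():
        # Fetch a hypothetical id/name listing of the stored datasets
        with engine.connect() as conn:
            rows = conn.execute(text("SELECT id, name FROM dataset")).fetchall()
        return jsonify([{"id": row.id, "name": row.name} for row in rows])

    if __name__ == "__main__":
        app.run()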

Harvesters

The harvesters are Python modules in the backend server which monitor directories for tester datafiles, parse them according to their format, and write the data and any metadata into the Postgres database. The harvesters are run, either periodically or manually by a user, using a Celery distributed task queue; a sketch of this arrangement is shown below.
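
A minimal sketch of a harvester as a Celery task, assuming a Redis broker and a hypothetical parse_and_upload helper (neither is confirmed by this README):

    # A minimal sketch of running a harvester via Celery. The broker URL,
    # task module name, and parse_and_upload helper are assumptions for
    # illustration, not the real harvester code.
    from pathlib import Path
    from celery import Celery

    app = Celery("harvesters", broker="redis://localhost:6379/0")

    def parse_and_upload(datafile: Path) -> None:
        # Hypothetical placeholder: a real harvester dispatches to a
        # format-specific parser and writes rows into Postgres.
        print(f"would import {datafile}")

    @app.task
    def harvest(monitored_dir: str) -> None:
        """Scan a monitored directory and import any tester datafiles."""
        for datafile in Path(monitored_dir).iterdir():
            if datafile.is_file():
                parse_and_upload(datafile)

    # Periodic runs can be scheduled with Celery beat, e.g. every 10 minutes:
    app.conf.beat_schedule = {
        "harvest-tester-output": {
            "task": "harvesters.harvest",  # hypothetical task path
            "schedule": 600.0,
            "args": ("/data/tester-output",),
        }
    }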

Frontend web application

The frontend is written in JavaScript, using the React framework and Material-UI components.

Database

The project uses PostgreSQL for its database; other databases are currently not supported. An entity relationship diagram is shown below.

[Diagram: Galvanalyser entity relationship diagram]
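
For direct access outside the web app, a psycopg2 query might look like the following sketch (the connection details and the "dataset" table are assumptions for illustration; see the entity relationship diagram for the real schema):

    # A minimal sketch of querying the PostgreSQL database directly with
    # psycopg2. Connection parameters and the "dataset" table are
    # assumptions for illustration only.
    import psycopg2

    conn = psycopg2.connect(
        host="localhost",
        dbname="galvanalyser",
        user="user",
        password="password",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id, name FROM dataset ORDER BY id")
        for dataset_id, name in cur.fetchall():
            print(dataset_id, name)
    conn.close()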

Comments
  • docker-compose fails to build frontend after clean install

    Describe the bug

    Building galvanalyser_frontend
    [+] Building 5.3s (14/16)
    => ERROR [build 7/7] RUN npm run build 3.9s

    [build 7/7] RUN npm run build:
    #14 0.768
    #14 0.768 > [email protected] build
    #14 0.768 > react-scripts build
    #14 0.768
    #14 3.085 Creating an optimized production build...
    #14 3.762 Error: error:0308010C:digital envelope routines::unsupported

    To Reproduce: after a clean install, change the directories in .env to match your setup, then run docker-compose build.

    Expected behavior: the build succeeds.

    I believe it's similar to https://github.com/webpack/webpack/issues/14532. Adding ENV NODE_OPTIONS=--openssl-legacy-provider to the Dockerfile made it work for me locally.

    bug 
    opened by pghege 2
  • Matlab

    The MATLAB code is not completely equivalent to the Python code (it doesn't do column selection, but it does allow for extending to multiple datasets). That can be changed if we like, I suppose!

    opened by mjaquiery 1
  • Hi Martin, Could you please add a feature for API code generation in MATLAB (in addition to the existing Python). Thank you!

    enhancement 
    opened by emmanuellehagopian 1
  • Harvester Failed Import

    Whenever a file fails to import, it seems from the worker log to be ignored the next time the harvester is run, leading users to create multiple harvesters monitoring the same file path.

    1. Go to Harvester
    2. Create Harvester
    3. Import a badly formatted file, causing the import to fail
    4. Fix the file formatting and try to rerun the harvester with the same file

    Expected behavior: ideally the harvester would notice changes to the file size and rerun the import process.

    bug 
    opened by asafdari-boop 1
  • Harvester Deletion Button

    The delete harvester button does not appear to work. The React component row for that harvester stays on the webpage, and I think you can still run it, meaning it may not actually be deleted.

    1. Go to Harvester
    2. Create Harvester
    3. Press Delete

    Expected behavior: the harvester disappears along with its data path.

    bug 
    opened by asafdari-boop 1
  • Yarn vs NPM

    Is your feature request related to a problem? Please describe. I notice some modules are using yarn while others are using npm; this discrepancy may cause build failures in the future. The galvanalyser frontend uses yarn in Dockerfile_dev but npm in Dockerfile. Additionally, there is a yarn.lock file present in the directory, which suggests Yarn is preferred. Should the prod Dockerfile be changed to use yarn?

    Describe the solution you'd like To rely on one build tool. I have a slight preference for Yarn, but I'm just looking for clarity and consistency.

    enhancement 
    opened by pghege 1
  • incompatible licence for maccor_input_file.py

    This file uses a license (copied below) that is incompatible with the BSD-2 license used by Galvanalyser as a whole; we need to resolve this.

    @LuAPi, you are the primary author of this, are you happy to change the license on this file to BSD2?

    #!/usr/bin/env python
    
    # =========================== MRG Copyright Header ===========================
    #
    # Copyright (c) 2003-2019 University of Oxford. All rights reserved.
    # Authors: Mobile Robotics Group, University of Oxford
    #          http://mrg.robots.ox.ac.uk
    #
    # This file is the property of the University of Oxford.
    # Redistribution and use in source and binary forms, with or without
    # modification, is not permitted without an explicit licensing agreement
    # (research or commercial). No warranty, explicit or implicit, provided.
    #
    # =========================== MRG Copyright Header ===========================
    #
    # @author Luke Pitt.
    #
    
    opened by martinjrobins 0
  • Dependencies

    Updated dependencies (React, MUI) and switched the docker-compose.dev.yml workflow to a docker-compose.override.yml workflow. For production deployment, the override file can be removed.

    opened by mjaquiery 0
  • support maccor binary files

    Is your feature request related to a problem? Please describe.

    It would be great to be able to import Maccor binary files while the test is progressing.

    Describe the solution you'd like

    The Maccor parser currently supports text versions of Maccor output files; it needs to be extended to handle binary files.

    Additional context

    Blocker: we currently can't find any information about the Maccor binary file format, or any programs/libraries for importing it.

    enhancement 
    opened by martinjrobins 6
  • It would be good to be able to search the datasets by battery cell parameter (e.g. anode chemistry, cathode chemistry). Thank you :)

    The Problem: The cell data includes 7 parameters per battery name. Currently, you can only filter the datasets by cell name, which shows up as a column in datasets. You cannot search datasets by battery cell parameters (e.g. cathode chemistry). This would be useful for comparing tests with common parameters.

    The Solution: It would be useful to filter the datasets by the 7 parameters not present as columns in the datasets table (adding the 7 parameters to the datasets table would make it too busy). E.g. if I wanted to compare LFP cells, I could filter and see all the comparable datasets with this common battery cell parameter.

    enhancement 
    opened by emmanuellehagopian 2
  • we need to find a better way to enter cell UIDs when entering metadata for a large number of cells from the same family

    The problem At the moment, all details in the "cells" table need to be entered separately for every cell, even if it's the same make/size/type etc. as others. This is super tedious and discourages people from entering metadata.

    Possible solutions One solution (maybe a quick fix) is to add a "copy" button, so you can duplicate the details of another row already in the cell table and just change the cell UID. A possibly better solution would be to create a new table of "cell families" (or similar); the cell table would then just require users to enter UIDs and choose which cell family to associate each UID with (from a drop-down list). (NB: this would need error checking to make sure the cell family already exists and, if not, take the user there first. Also, do we check that UIDs are actually unique? I hope so.) Thinking about it, this second option is way better, because then you could compare data for all cells from a given family more easily.

    enhancement 
    opened by davidhowey 3
  • Incomplete Cell description breaks API import of data

    Describe the bug If you don't fill in all the descriptors for cell metadata when you make a new one (e.g. anode chemistry), the API will error when trying to import cycling data with that cell metadata attached.

    To Reproduce Steps to reproduce the behavior:

    1. create a new cell metadata
    2. leave some of the options blank
    3. attach metadata to a cycler test
    4. import that cycler data using the python API
    5. API fails

    Expected behavior: Python will error when you run the import code.

    bug 
    opened by adamL-D 6
  • put license preamble in all source files

    # SPDX-License-Identifier: BSD-2-Clause
    # Copyright  (c) 2020-2022, The Chancellor, Masters and Scholars of the University
    # of Oxford, and the 'Galvanalyser' Developers. All rights reserved.
    
    opened by martinjrobins 0