The mitosheet package, trymito.io, and other public Mito code.

Overview

Mito Monorepo

Mito is a spreadsheet that lives inside your JupyterLab notebooks. It allows you to edit Pandas dataframes like an Excel file, and generates Python code that corresponds to each of your edits.

Mito aims to be the first tool in your data science toolkit and supports:

  • Point-and-click CSV and XLSX import
  • Excel-style pivot tables
  • Graph generation
  • Filtering and sorting
  • Merge (lookups)
  • Excel-Style formulas
  • Column summary statistics
  • And much more!

Mito is an open source tool (look around...), and will always be built by and for our community. See our plans page for more detail about our features, and consider purchasing Mito Pro to help fund development.

⚡️ Quick start

To get started, follow the install instructions here.

If you're interested in Mito Pro, see our plans page.

Documentation

You can find all Mito documentation available here.

Getting Help

To get support, join our Discord.

Docker Quick Start

Coming soon!

MyBinder

MyBinder link for the main branch: Binder

Contributing

This repo is the monorepo for the Mito project. It contains the mitosheet package, the trymito.io website, and our documentation.

Mitosheet

To see the code for the mitosheet package, see the mitosheet folder.

Trymito.io

To see the code for our website, see the trymito.io folder.

Comments
  • Change import

    Description

    This PR adds the update imports taskpane. This is the first step in the plan of action. The next steps are to add the error messaging, correct popups, and undo/redo/clear.

    The main things to focus on in this review are:

    1. Because I very much wanted to reuse the existing import taskpanes so we can get error handling in them for free, we need to pass the updated import params around since we can't store state in the update import taskpane while the other taskpanes are open. This isn't so bad, but isn't ideal. It makes it hard, for example, to detect which imports have changed and which have not.
    2. The existing_import.py file. This is the method described in the spec, but there were more edge cases to handle than I initially expected (which is to be expected).
    3. And of course, whatever else stands out. I'm going to do more polishing, but want to get your review of the main architecture before I move forward.

    Testing

    Please provide a list of the ways you can "access" or use the functionality. Please try and be exhaustive here, and make sure that you test everything you list.

    • [ ] I have tested this on real data that is reasonable and large
    • [ ] If I changed the interaction with JupyterLab, I tested that it does not break other programs (like VS Code), and tested that it works "multiple times" in the same notebook.

    Documentation

    Note if any new documentation needs to be addressed or reviewed.

    cla-signed 
    opened by aarondr77 50
  • mitosheet: remove all widget infra

    Description

    This PR does a few cool things:

    1. Removes ipywidgets, jupyter_widgets entirely as dependencies. We no longer need these things.
    2. Changes notion of MitoWidget to MitoBackend -- this isn't a widget anymore!

    To do this, it has to do a ton of work we didn't expect, so that the comms properly handle being loaded, refreshed, recreated, etc.

    On differences between versions of Jupyter

    The main thing to call out here is that there are differences in how different versions of jupyter save the output JS that you put on the page:

    1. Jupyter Lab==3.4: saves the entire mitosheet, and rerenders the mitosheet when you refresh the page. This is best case.
    2. Jupyter Lab==3.0: saves some of the code, but not all of it. It doesn't appear to rerun the re-render, so you have to rerun the mitosheet.sheet call to see the mitosheet.
    3. Jupyter Notebook: It doesn't save any JS output. So you need to rerun the cell, to see the mitosheet again.

    I think these differences in refresh behavior are generally fine. I don't think we have any ability to change them; it seems to come down to what each version saves.

    Iteration on this sort of change

    Some big lessons learned here:

    1. Don't get greedy and try and change other things (I went down a 1 day rabbit hole as a result of the buttons lol)
    2. We could do more explicit testing up-front of some of the integrations (like, actually using the comms in my tests would have been a good idea)

    However, it looks like we'll mostly hit the 2 weeks we aimed for with this product cycle, so I can't complain too much!

    Testing

    Test:

    1. Current JLab
    2. Old JLab
    3. Notebooks
    4. Windows
    5. On earlier versions of Lab back to ==3.0
    6. With multiple notebooks open at once
    7. With multiple mitosheets per page.
    8. Refreshing the page with mitosheet open
    9. Restarting the kernel then refreshing the page
    10. Restarting the kernel and not refreshing the page (TODO: how do we want to handle this?)

    Documentation

    No changes here required!

    opened by naterush 36
  • Graph fullsheet

    Description

    Makes graphing a full sheet experience. Each graph now is its own tab in Mito. The fact that the graph is an entire sheet and the fact that the graph is persistent makes graphing feel way more powerful, even without the upcoming functionality changes.

    It allows you to delete, duplicate, and rename graphs. It also lets you generate a graph from a datasheet through the sheet tab actions.

    It also gets rid of the usage triggered feedback so we don't need to maintain that anymore.

    There are a few things that this PR leaves to be desired:

    Number 1: there is a tradeoff between making redo work properly when the entire graph is deleted, and making the graph refresh when opening the graph. The reason that this tradeoff exists is the same reason that we can't properly handle redo in pivot table (where we can't refresh the params). Namely, we can't tell the difference between a redo event and just a regular edit. See the existing issue here https://github.com/mito-ds/monorepo/issues/120

    Specifically, for graphing we can't tell the difference between opening a graph / switching from one graph to another (both of which cause the graphID in the GraphSidebar to change) and a redo event that recreates a graph that was previously completely removed using undo.

    The graph already has issues with redo (namely, the parameters aren't kept in sync), and I think it's a more common use case that people will create a graph, edit the data, and then want to see the updated graph (this is one of our core design principles of graphing: respect the relationship between graphing and transformation). So I vote we opt for refreshing the graph, which means redo will not work in this specific instance.

    Number 2: https://github.com/mito-ds/monorepo/issues/147

    Number 3: Because of the issues documented in https://github.com/mito-ds/monorepo/issues/145, when you create a graph, it actually gets created as two different steps. Therefore, you have to press Undo twice to make the graph fully disappear.

    Number 4: We should make the graph taskpane play nicer with the toolbar. However, I don't want to expand this PR to do so, so let's address it separately. I added a note to the next spec to specify this in a bit more detail and add it to the plan of attack.

    Testing

    Make sure that it plays nicely with undoing graphs, and that the correct graph is always selected.

    Once we merge into dev, we can test the spacing on windows

    • [x] I have tested this on real data that is reasonable and large
    • [x] If I changed the interaction with JupyterLab, I tested that it does not break other programs (like VS Code), and tested that it works "multiple times" in the same notebook.

    Documentation

    Yes, this requires documentation changes, but we should wait until we have finished this week's changes before updating.

    cla-signed 
    opened by aarondr77 32
  • Metaprogram 2.0

    Description

    This PR does a few things:

    1. Adds conditional formatting.
    2. Adds df import into the sheet. I think the implementation is pretty simple + sweet (from a UI perspective) - and will extend well to different types of import. Let me know thoughts.
    3. Overhauls the metaprogramming package to do a lot more useful things. We're going to keep investing in this, as it literally pays huge dividends!

    Things to figure out

    Duplicated Imports

    There's one thing to figure out. If the user imports the same dataframe twice (either in the mitosheet.sheet call, or through the new interface), the generated code gets out of sync of the sheet. Thoughts?

    We could just auto-copy the dataframe in the case where you import two of them - I'm fine w/ that. Let me know.
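To make the auto-copy option concrete, here is a minimal sketch. It assumes imports arrive as a list of (name, object) pairs; `dedupe_imports` is a hypothetical helper, not part of mitosheet, and plain dicts stand in for dataframes so the example runs without pandas. It does not handle a numbered name colliding with an existing one.

```python
import copy


def dedupe_imports(named_dfs):
    """Return (name, obj) pairs in which no object is shared by two entries.

    Hypothetical helper: a second import of the same object is replaced
    by a deep copy and given a numbered name.
    """
    seen_ids = set()
    name_counts = {}
    result = []
    for name, obj in named_dfs:
        if id(obj) in seen_ids:
            # Break the aliasing so edits to one sheet tab stay independent.
            obj = copy.deepcopy(obj)
        seen_ids.add(id(obj))
        count = name_counts.get(name, 0)
        name_counts[name] = count + 1
        unique_name = name if count == 0 else f"{name}_{count}"
        result.append((unique_name, obj))
    return result
```

With this approach the generated code could refer to each copy by its own name, so edits to one tab would not silently change the other.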

    Empty Sheet

    I'm thinking we could change the empty sheet so that it has two things:

    1. A button that lets you import files.
    2. Something that lets you easily click on a dataframe name to get it in the sheet immediately!

    Thoughts? I think it could be really cool.

    Testing

    Test df import. Test some interesting conditional formats. Some specific things to test with conditional formats are:

    1. Weird indexes in df
    2. Invalid filters
    3. Changing types after making the filters
    4. Etc. Put it in weird positions.

    Documentation

    We will need to write docs before deploying!

    cla-signed 
    opened by naterush 28
  • Bulk filter

    Description

    Implements the final version of the spec here: https://www.notion.so/trymito/Toggle-All-in-Values-bc5211451bbb4ce7844893f9b0d007a4

    The one change from the spec is the fact that the refresh button is always there. This is because there are many ways for this data to get out of date (e.g. editing it) - and always being able to refresh was really nice for me when I was testing!

    Testing

    Do some bulk filtering!

    Documentation

    We need to rewrite filter docs.

    cla-signed 
    opened by naterush 28
  • Sharing analyses

    Description

    This PR implements this specification: https://www.notion.so/trymito/Sharing-Analyses-Specification-5acb830b2345454da6de8e71e15d25fd

    Major differences from the spec:

    1. We store the author hash rather than the static_user id
    2. Write the cell metadata rather than notebook metadata
    3. Allow users to expand the steps to see what they look like before trusting them
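The trust check implied by the first two differences can be sketched as follows. This is an illustration only: the choice of SHA-256, the `author_hash` metadata key, and both function names are assumptions, since the PR text does not specify them.

```python
import hashlib


def author_hash(static_user_id: str) -> str:
    # Assumption: SHA-256 hex digest. The PR does not say which hash is used.
    return hashlib.sha256(static_user_id.encode("utf-8")).hexdigest()


def needs_trust_prompt(cell_metadata: dict, current_user_id: str) -> bool:
    """Return True when the saved analysis was written by a different author.

    A missing hash is treated as untrusted (hypothetical policy).
    """
    saved = cell_metadata.get("author_hash")
    return saved != author_hash(current_user_id)
```

Storing only the hash means the cell metadata can travel with a shared notebook without exposing the raw user id.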

    Testing

    There are a few things to test:

    1. Create an analysis, delete it off your computer in ~/.mito/saved_analyses, it should still replay
    2. Create an analysis, delete the cell below, it should still replay
    3. Create an analysis, delete it from both locations, and it should display an error
    4. Create an analysis, change your static_user_id in user.json, and it should ask you to trust it before replaying (make sure to refresh kernel)
    5. Create multiple analyses in notebook
    6. Create analyses in Lab, move to Notebook, and vice versa
    7. Create analyses, change user.json and move one of the imported files. Should first ask you to trust, then open the PreReplayUpdateImports taskpane (and this should work).

    Any more tests you can think of as well!

    Documentation

    cla-signed 
    opened by naterush 22
  • Formatting v2

    Description

    A few differences to call out:

    1. The options in the Format dropdown are a bit different, but sensible.
    2. The default formatting is different in mito and outside mito (this is fine, I think... unless users want formatting that matches the default, what do?).
    3. Also, the default formatting now only displays 2 decimals on floats, let me know what ya think.

    The fact that there are two different dropdowns for formatting is pretty dang weird, but I think we can iterate this once we discover it's confusing (which it is to me!).

    What is left

    There are two major things left to do, but I'm pausing here (b/c wrote enough code today...)

    • [ ] Write the docs
    • [ ] Write some real tests

    I will do both before merging, but it's ready to review now!

    Testing

    Test all the formats with all the precisions, and make sure the printed dataframes look the same!

    Documentation

    Need to write documentation for this!

    cla-signed 
    opened by naterush 21
  • Split text to columns

    Description

    Adds Split text to columns on delimiters. It also adds a bit of helpful infrastructure for putting buttons at the bottom of a taskpane.

    Future implementation:

    1. It would be nice if we could press the submit button with Enter. This infrastructure could be used by all of the useEditOnClick taskpanes. I opened an issue for it https://github.com/mito-ds/monorepo/issues/321
    2. It's a bit confusing that it doesn't work with formatting. We could take a stab at adding the unformatted view of the data to the preview and have that unformatted view displayed when the taskpane opens. What do you think? @naterush

    There are a few known bugs in splitting text to columns:

    1. Using . along with another delimiter behaves weirdly. https://github.com/mito-ds/monorepo/issues/322.
    2. The split function sometimes returns None and other times returns nan, which doesn't get converted to NaN in the df_to_json_dumpsable function. https://github.com/mito-ds/monorepo/issues/323
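The causes of these two bugs are not stated here. As a hedged illustration only: one common source of odd behavior with "." is that it is a regex metacharacter, and one way to avoid mixed None/nan results is to normalize missing values before splitting. `split_on_delimiters` is a hypothetical helper, not the mitosheet implementation.

```python
import math
import re


def split_on_delimiters(value, delimiters):
    """Split a string on any of the given literal delimiters."""
    # Treat None and float nan the same way, so callers see one kind of "missing".
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return []
    if not delimiters:
        return [value]
    # re.escape keeps "." from being read as "match any character".
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, value)
```

For example, `split_on_delimiters("a.b-c", [".", "-"])` returns `["a", "b", "c"]`.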

    Testing

    Nothing special here.

    Documentation

    Check out the documentation draft and merge when ready.

    cla-signed 
    opened by aarondr77 21
  • Copy and paste

    Description

    This PR adds copy and paste to Mito. A few notes about this:

    1. One of the major things I had to do was focus refactoring. Now, pretty much no matter where you click in Mito, you'll remain focused on the mito object. The major refactor here was fixing up when / how the dropdowns display.
    2. Since we have good focus handling now, check out the shortcuts I added to the menus... they are... kinda sicko mode.

    There are two main things to decide:

    1. How to handle \t values.
    2. How to handle NaN values.

    My thinking on (1) is that a) I don't know how often this occurs and b) I tried and failed to handle it for about 3 hours. I can spend more time here, but my concern is that b/c I don't really understand what I'm doing, I'm as likely to break things with this fix as I am to fix it. Same with (2). Let me know your thoughts.
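For discussion, one possible way to handle both cases is sketched below, using only the standard library. It assumes copied cells are serialized as tab-separated text; `rows_to_clipboard_text` is a hypothetical name, and whether the paste target honors quoted tabs is also an assumption.

```python
import csv
import io
import math


def rows_to_clipboard_text(rows):
    """Serialize rows as tab-separated text.

    Cells containing a tab are quoted by the csv writer, and float nan
    is written as an empty cell.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([
            "" if isinstance(cell, float) and math.isnan(cell) else cell
            for cell in row
        ])
    return buffer.getvalue()
```

Reading the text back with `csv.reader(..., delimiter="\t")` recovers the embedded tab as part of a single cell.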

    We'll get these fixed up (or decide not to) before we merge this, so this is kinda an incremental review rather than a final one!

    Testing

    Import data, do some pivoting or whatever, and copy and paste the data out! Try this with a few different datasets and all the different datatypes!

    Documentation

    Need to write documentation for copy and paste, probably!

    cla-signed 
    opened by naterush 18
  • Teams page

    Description

    Adds a team page and pro/enterprise roadmap to the plans page

    Testing

    Please provide a list of the ways you can "access" or use the functionality. Please try and be exhaustive here, and make sure that you test everything you list.

    • [ ] I have tested this on real data that is reasonable and large
    • [ ] If I changed the interaction with JupyterLab, I tested that it does not break other programs (like VS Code), and tested that it works "multiple times" in the same notebook.

    Documentation

    Note if any new documentation needs to be addressed or reviewed.

    cla-signed 
    opened by aarondr77 17
  • mitosheet: add a csv config taskpane

    Description

    Implements: https://www.notion.so/trymito/Import-Improvements-bb65727e1ba1479ea91ed176077fe589

    A few questions to really polish this off:

    1. Do we want to case on the types of errors to display a different message? I wonder how well we might be able to do here, or is it not worth it.

    Testing

    Add something that causes an error to the path where we try to guess the delimiter, to see if it automatically opens the config.
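The fallback being tested can be pictured with a small sketch. It uses the standard library's `csv.Sniffer` as a stand-in for whatever guessing logic mitosheet actually uses; `guess_delimiter` and the candidate list are assumptions.

```python
import csv


def guess_delimiter(sample, candidates=",;\t|"):
    """Return the guessed delimiter, or None if no guess could be made."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The caller would open the CSV config taskpane at this point.
        return None
```

Returning None instead of raising lets the caller decide between showing an error and opening the config.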

    Documentation

    We should update our import documentation!

    cla-signed 
    opened by naterush 15
  • [WIP] mitosheet: start snowflake import

    Description

    Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context, and the specification if there is one.

    Testing

    Please provide a list of the ways you can "access" or use the functionality. Please try and be exhaustive here, and make sure that you test everything you list.

    • [ ] I have tested this on real data that is reasonable and large
    • [ ] If I changed the interaction with JupyterLab, I tested that it does not break other programs (like VS Code), and tested that it works "multiple times" in the same notebook.

    Documentation

    Note if any new documentation needs to be addressed or reviewed.

    opened by naterush 1
  • Make it so you don't have to refresh kernel to develop

    Is your feature request related to a problem? Please describe. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

    Describe the solution you'd like A clear and concise description of what you want to happen.

    Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

    Additional context Add any other context or screenshots about the feature request here.

    opened by naterush 0
  • Auto-generated `import` statement improvement

    Is your feature request related to a problem? Please describe.

    The autogenerated code has a from mitosheet import * statement on top of the cell.

    When you copy and paste the code itself somewhere else it does not work because the code depends on that import, so you end up with situations like:

    NameError                                 Traceback (most recent call last)
    Cell In[11], line 16
    ---> 16 plotdf['stage-group'] = (to_int_series(mitosheet.RIGHT(plotdf['stage'], 1)) - 1) // 3
    
    NameError: name 'to_int_series' is not defined
    

    There is no neat solution, one has to do one of these:

    1. import the missing functions one by one
    2. keep the import * , which is a bad practice

    Describe the solution you'd like

    The autogenerated code would be easier to reuse with one of the two options below:

    1. Import only the functions that are used, explicitly:
    from mitosheet import to_int_series  # for example

    2. Import the module and call functions from it:
    import mitosheet as mito  # or some other short alias
    # call the function from the module
    plotdf['stage-group'] = mito.to_int_series(...) // 3
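As a rough sketch of how the first option could be produced automatically, the generated code can be parsed and scanned for names that the package exports. `explicit_import_line` is a hypothetical helper, and the set of exported names passed in is sample data, not the real mitosheet export list.

```python
import ast


def explicit_import_line(code, exported_names, module="mitosheet"):
    """Build a `from module import ...` line for the exported names used in code."""
    used = {
        node.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Name) and node.id in exported_names
    }
    if not used:
        return ""
    return f"from {module} import {', '.join(sorted(used))}"
```

Applied to the failing line from the traceback with `{"to_int_series", "RIGHT"}` as the exported names, this yields `from mitosheet import to_int_series`; `RIGHT` is not included because it is accessed there as `mitosheet.RIGHT`.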
    
    opened by jpcbertoldo 1
  • Filtering on datetime column gives temporary error when not typing quickly enough

    When a user filters a date column by typing the date in manually and does not type quickly enough, an execution error screen pops up, even though the filter is applied properly once the user finishes typing the date correctly.

    Steps to reproduce:

    • Open any dataframe in mito that has a datetime column.
    • Add a filter to that datetime column and select >= from the dropdown menu.
    • Begin typing in a date starting with month and day (e.g., 10, 30)
    • When you get to the year, type 20, then pause for a moment and an error will appear.
    • Continue typing the year 2022 and see that the filter is applied to the column correctly.

    Expected behavior:

    • Either the error should not appear at all provided a valid year is entered within a reasonable amount of time, or the error window should disappear when the user enters a valid date so they don't have to close the error screen and/or wonder if the filter was properly applied.
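One way to get the first behavior is to treat incomplete input as "not ready yet" rather than as an error. The sketch below assumes a month/day/four-digit-year format; `parse_complete_date` is a hypothetical helper, and Mito's actual date parsing is not shown in this report.

```python
from datetime import datetime


def parse_complete_date(text, fmt="%m/%d/%Y"):
    """Return a datetime if text is a complete date in fmt, else None."""
    try:
        return datetime.strptime(text.strip(), fmt)
    except ValueError:
        # Partial or invalid input: keep the previous filter and show no error.
        return None
```

With this check, a paused entry such as `10/30/20` returns None, and the filter is only applied once `10/30/2022` has been typed in full.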

    Actual behavior:

    • Execution error window appears and has to be manually closed.
    • Error message below

    image

    Bonus bug within a bug :-)

    Unable to copy the text from the full traceback (hence the screenshot of the partial message). When highlighting the text of the traceback, it appears to actually be copying some of the text in the column it was filtering on.

    opened by plasmonresonator 1
  • Download graph as html file: downloading extra graphs

    When I copied the code for one graph, pasted it, and ran it as html, it generated the iframe folder and populated that folder with the correct graph. However, it also put other graphs from the generated code of other mitosheets into that folder, even though I only copied the show graph code for those.

    It also put a copy of the graph directly in my working directory, outside the iframe folder, so I ended up with two copies.

    opened by aarondr77 0