Debugging, monitoring and visualization for Python Machine Learning and Data Science

Overview

Welcome to TensorWatch

TensorWatch is a debugging and visualization tool designed for data science, deep learning and reinforcement learning from Microsoft Research. It works in Jupyter Notebook to show real-time visualizations of your machine learning training and perform several other key analysis tasks for your models and data.

TensorWatch is designed to be flexible and extensible so you can also build your own custom visualizations, UIs, and dashboards. Besides the traditional "what-you-see-is-what-you-log" approach, it can also execute arbitrary queries against your live ML training process, return a stream as the result of the query, and display this stream in a visualizer of your choice (we call this Lazy Logging Mode).

TensorWatch is under heavy development with a goal of providing a platform for debugging machine learning in one easy to use, extensible, and hackable package.

TensorWatch in Jupyter Notebook

How to Get It

pip install tensorwatch

TensorWatch supports Python 3.x and is tested with PyTorch 0.4-1.x. Most features should also work with TensorFlow eager tensors. TensorWatch uses graphviz to create network diagrams; depending on your platform, you may need to install it manually.

How to Use It

Quick Start

Here is a simple script that logs an integer and its square as a tuple to TensorWatch once per second:

import tensorwatch as tw
import time

# streams will be stored in test.log file
w = tw.Watcher(filename='test.log')

# create a stream for logging
s = w.create_stream(name='metric1')

# generate Jupyter Notebook to view real-time streams
w.make_notebook()

for i in range(1000):
    # write x,y pair we want to log
    s.write((i, i*i))

    time.sleep(1)

When you run this code, you will notice a Jupyter Notebook file test.ipynb gets created in your script folder. From a command prompt type jupyter notebook and select test.ipynb. Choose Cell > Run all in the menu to see the real-time line graph as values get written in your script.

Here's the output you will see in Jupyter Notebook:

TensorWatch in Jupyter Notebook

To dive deeper into the various other features, please see Tutorials and notebooks.

How does this work?

When you write to a TensorWatch stream, the values get serialized and sent to a TCP/IP socket as well as the file you specified. From Jupyter Notebook, we load the previously logged values from the file and then listen to that TCP/IP socket for any future values. The visualizer listens to the stream and renders the values as they arrive.

Ok, so that's a very simplified description. The TensorWatch architecture is actually much more powerful. Almost everything in TensorWatch is a stream. Files, sockets, consoles and even visualizers are streams themselves. A cool thing about TensorWatch streams is that they can listen to any other streams. This allows TensorWatch to create a data flow graph. This means that a visualizer can listen to many streams simultaneously, each of which could be a file, a socket or some other stream. You can recursively extend this to build arbitrary data flow graphs. TensorWatch decouples streams from how they get stored and how they get visualized.
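The "everything is a stream" idea can be illustrated with a few lines of plain Python. This is only a sketch of the concept, not TensorWatch's implementation: the class names `Stream`, `MemoryStore`, and `TextVisualizer` are hypothetical and are not part of the TensorWatch API.

```python
class Stream:
    """Hypothetical minimal stream: handles a value, then forwards it to its listeners."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, other):
        """Make `other` listen to this stream; returns `other` for convenience."""
        self._subscribers.append(other)
        return other

    def write(self, value):
        self.on_value(value)
        for sub in self._subscribers:
            sub.write(value)

    def on_value(self, value):
        """Hook for subclasses: store, send, or draw the value."""


class MemoryStore(Stream):
    """Stands in for a file or socket stream by keeping values in a list."""

    def __init__(self, name):
        super().__init__(name)
        self.values = []

    def on_value(self, value):
        self.values.append(value)


class TextVisualizer(Stream):
    """Stands in for a visualizer by rendering each value as a line of text."""

    def __init__(self, name):
        super().__init__(name)
        self.lines = []

    def on_value(self, value):
        self.lines.append(f"{value[0]} -> {value[1]}")


# Build a small data flow graph: one stream is stored AND plotted,
# and the same visualizer also listens to a second stream.
loss = Stream("loss")
accuracy = Stream("accuracy")
store = loss.subscribe(MemoryStore("store"))
plot = TextVisualizer("plot")
loss.subscribe(plot)
accuracy.subscribe(plot)

for i in range(3):
    loss.write((i, i * i))
accuracy.write((0, 0.5))

print(store.values)
print(plot.lines)
```

Because storage and rendering are both just listeners, swapping the store for a socket or adding a second visualizer does not require changing the code that writes the values.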

Visualizations

In the above example, the line graph is used as the default visualization. However, TensorWatch supports many other diagram types including histograms, pie charts, scatter charts, bar charts and 3D versions of many of these plots. You can log your data, specify the chart type you want and let TensorWatch take care of the rest.

One of the significant strengths of TensorWatch is the ability to combine, compose, and create custom visualizations effortlessly. For example, you can choose to visualize an arbitrary number of streams in the same plot. Or you can visualize the same stream in many different plots simultaneously. Or you can place an arbitrary set of visualizations side-by-side. You can even create your own custom visualization widget simply by creating a new Python class, implementing a few methods.

Comparing Results of Multiple Runs

Each TensorWatch stream may contain a metric of your choice. By default, TensorWatch saves all streams in a single file, but you could also choose to save each stream in separate files or not to save them at all (for example, sending streams over sockets or into the console directly, zero hit to disk!). Later you can open these streams and direct them to one or more visualizations. This design allows you to quickly compare the results from your different experiments in your choice of visualizations easily.

Training within Jupyter Notebook

Often you might prefer to do data analysis, ML training, and testing - all from within Jupyter Notebook instead of from a separate script. TensorWatch can help you do sophisticated, real-time visualizations effortlessly from code that is run within a Jupyter Notebook end-to-end.

Lazy Logging Mode

A unique feature in TensorWatch is the ability to query the live running process, retrieve the result of this query as a stream and direct this stream to your preferred visualization(s). You don't need to log any data beforehand. We call this new way of debugging and visualization a lazy logging mode.

For example, as seen below, we visualize input and output image pairs, sampled randomly during the training of an autoencoder on a fruits dataset. These images were not logged beforehand in the script. Instead, the user sends a query as a Python lambda expression, which results in a stream of images that gets displayed in the Jupyter Notebook:

TensorWatch in Jupyter Notebook

See Lazy Logging Tutorial.
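The core idea of lazy logging can be sketched in plain Python: the training loop only exposes its current variables, and a query that arrives later, written as the text of a lambda expression, decides what actually gets recorded. The `LazyWatcher` class below and its method names are hypothetical and chosen for illustration; they are an assumption, not the TensorWatch API, and the sketch runs in a single process instead of over sockets.

```python
class LazyWatcher:
    """Hypothetical stand-in for a watcher that exposes live variables to queries."""

    def __init__(self):
        self._observed = {}
        self._queries = []  # list of (callable, result list) pairs

    def observe(self, **variables):
        """Publish the latest values and evaluate every registered query on them."""
        self._observed.update(variables)
        for func, results in self._queries:
            results.append(func(self._observed))

    def create_stream(self, expr):
        """Register a query given as the source text of a Python lambda.

        eval() runs arbitrary code, so this is only acceptable for trusted input.
        """
        func = eval(expr, {})
        results = []
        self._queries.append((func, results))
        return results


watcher = LazyWatcher()

# "Training" starts; nothing is recorded because no query exists yet.
for step in range(2):
    watcher.observe(step=step, loss=10 - step)

# A query arrives while training is in progress.
doubled = watcher.create_stream("lambda d: (d['step'], d['loss'] * 2)")

for step in range(2, 5):
    watcher.observe(step=step, loss=10 - step)

print(doubled)
```

Only the steps that ran after the query was registered appear in the result, which is the trade-off of this mode: no logging cost up front, but no history from before the query either.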

Pre-Training and Post-Training Tasks

TensorWatch leverages several excellent libraries, including hiddenlayer, torchstat, and Visual Attribution, so that you can perform the usual debugging and analysis activities in one consistent package and interface.

For example, you can view the model graph with tensor shapes with a one-liner:

Model graph for Alexnet

You can view statistics for different layers such as flops, number of parameters, etc:

Model statistics for Alexnet

See notebook.
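To make the per-layer statistics concrete, here is a small hand calculation of the parameter count and multiply-accumulate operations (MACs) for a single convolution layer. It is a sketch that does not use TensorWatch or torchstat; the helper name `conv2d_stats` is made up for this example, and the layer settings (3 input channels, 64 output channels, 11x11 kernel, stride 4, padding 2, 224x224 input) are assumed to match the first convolution of torchvision's AlexNet.

```python
def conv2d_stats(in_ch, out_ch, kernel, stride, padding, in_size, bias=True):
    """Return (parameters, output size, MACs) for a square 2D convolution."""
    # Standard output-size formula for a convolution without dilation.
    out_size = (in_size + 2 * padding - kernel) // stride + 1
    weights = out_ch * in_ch * kernel * kernel
    params = weights + (out_ch if bias else 0)
    # Each output position of each output channel needs in_ch * kernel * kernel MACs.
    macs = weights * out_size * out_size
    return params, out_size, macs


params, out_size, macs = conv2d_stats(
    in_ch=3, out_ch=64, kernel=11, stride=4, padding=2, in_size=224
)
print(params, out_size, macs)  # 23296 55 70276800
```

Note that tools differ on whether they report MACs or count each multiply and add separately (roughly twice the MACs), so compare numbers only within the same tool.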

You can view the dataset in a lower dimensional space using techniques such as t-SNE:

t-SNE visualization for MNIST

See notebook.

Prediction Explanations

We wish to provide various tools for explaining predictions to help debug models. Currently, we offer several explainers for convolutional networks, including Lime. For example, the following highlights the areas that cause the Resnet50 model to make a prediction for class 240 for the Imagenet dataset:

CNN prediction explanation

See notebook.

Tutorials

Paper

More technical details are available in the TensorWatch paper (EICS 2019 Conference). Please cite this as:

@inproceedings{tensorwatch2019eics,
  author    = {Shital Shah and Roland Fernandez and Steven M. Drucker},
  title     = {A system for real-time interactive analysis of deep learning training},
  booktitle = {Proceedings of the {ACM} {SIGCHI} Symposium on Engineering Interactive
               Computing Systems, {EICS} 2019, Valencia, Spain, June 18-21, 2019},
  pages     = {16:1--16:6},
  year      = {2019},
  crossref  = {DBLP:conf/eics/2019},
  url       = {https://arxiv.org/abs/2001.01215},
  doi       = {10.1145/3319499.3328231},
  timestamp = {Fri, 31 May 2019 08:40:31 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/eics/ShahFD19},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contribute

We would love your contributions, feedback, questions, and feature requests! Please file a GitHub issue or send us a pull request. Please review the Microsoft Code of Conduct and learn more.

Contact

Join the TensorWatch group on Facebook to stay up to date or ask any questions.

Credits

TensorWatch utilizes several open source libraries for many of its features. These include: hiddenlayer, torchstat, Visual-Attribution, pyzmq, receptivefield, nbformat. Please see the install_requires section in setup.py for an up-to-date list.

License

This project is released under the MIT License. Please review the License file for more details.

Comments
  • Issue with draw model

    Hello,

    I've just installed tensorwatch and tried to reproduce the example:

    import tensorwatch as tw
    import torchvision.models
    
    alexnet_model = torchvision.models.alexnet()
    tw.draw_model(alexnet_model, [1, 3, 224, 224])
    

    and unfortunately I'm getting the following error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    ~/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
        343             method = get_real_method(obj, self.print_method)
        344             if method is not None:
    --> 345                 return method()
        346             return None
        347         else:
    
    ~/anaconda3/lib/python3.6/site-packages/tensorwatch/model_graph/hiddenlayer/pytorch_draw_model.py in _repr_svg_(self)
         11     def _repr_svg_(self):
         12         """Allows Jupyter notebook to render the graph automatically."""
    ---> 13         return self.dot._repr_svg_()
         14     def save(self, filename, format="png"):
         15         # self.dot.format = format
    
    AttributeError: 'Dot' object has no attribute '_repr_svg_'
    

    My versions are:

    • Python 3.6.9
    • IPython 7.9.0
    • Pytorch 1, 1.2 and 1.3.1 (I have tried with these three versions but nothing changes.)

    I also tried to run the given example on Google Colab and it raises the same error.

    Maybe the error comes from my version of IPython?

    opened by mondeg0 7
  • Cannot install with pip because depends on package not hosted on PyPi

    > pip install tensorwatch
    Collecting tensorwatch
      Using cached tensorwatch-0.9.0.tar.gz (187 kB)
    ERROR: Packages installed from PyPI cannot depend on packages which are not also hosted on PyPI.
    tensorwatch depends on [email protected] git+https://github.com/sytelus/[email protected]#egg=pydot 
    

    However, I can install tensorwatch version 0.8.10 without issue. I believe the problem was introduced by https://github.com/microsoft/tensorwatch/commit/353567f2071b4c7a5fae5afebeea787523c59762

    I'm running ubuntu 16 with pip 20.0.2.

    opened by jkerfs 5
  • pip install issue: "SyntaxError: invalid syntax"

    Hello, it looks like a very cool tool!! I tried to install with pip and got the following issue:

    Collecting tensorwatch
      Using cached https://files.pythonhosted.org/packages/ce/f2/4885c7f5ddf06224fc1443bb998464755e542c34f9966de4e686b9f1e43e/tensorwatch-0.8.4.tar.gz
    Requirement already satisfied: matplotlib in /media/ophir/DATA1/software/anaconda3/envs/pytorch/lib/python3.5/site-packages (from tensorwatch) (2.2.2)
    Requirement already satisfied: numpy in /media/ophir/DATA1/software/anaconda3/envs/pytorch/lib/python3.5/site-packages (from tensorwatch) (1.14.2)
    Requirement already satisfied: pyzmq in /media/ophir/DATA1/software/anaconda3/envs/pytorch/lib/python3.5/site-packages (from tensorwatch) (17.1.2)
    Requirement already satisfied: plotly in /media/ophir/DATA1/software/anaconda3/envs/pytorch/lib/python3.5/site-packages (from tensorwatch) (3.3.0)
    Collecting torchstat (from tensorwatch)
      Using cached https://files.pythonhosted.org/packages/bc/fe/f483b907ca80c90f189cd892bb2ce7b2c256010b30314bbec4fc17d1b5f1/torchstat-0.0.7-py3-none-any.whl
    Collecting receptivefield (from tensorwatch)
      Using cached https://files.pythonhosted.org/packages/cd/2a/a140221d151e228c5995e34f9c60d1ffd756f8672ccfbce8efe5da780671/receptivefield-0.4.0.tar.gz
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "", line 1, in
      File "/media/ophir/DATA1/ilyan/tmp/pip-install-gb2l8gtq/receptivefield/setup.py", line 13
        download_url=f'https://github.com/fornaxai/receptivefield/archive/{VERSION}.tar.gz',
        ^
    SyntaxError: invalid syntax
    ----------------------------------------
    ERROR: Command "python setup.py egg_info" failed with error code 1 in /media/ophir/DATA1/ilyan/tmp/pip-install-gb2l8gtq/receptivefield/

    Does anyone have a solution?

    Thanks, Ophir

    install issue 
    opened by ophir91 3
  • pip package missing json file

    I tried to run the following cnn_pred_explain notebook on Colab: https://github.com/microsoft/tensorwatch/blob/master/notebooks/cnn_pred_explain.ipynb

    It failed with the following error:

    ---------------------------------------------------------------------------
    FileNotFoundError                         Traceback (most recent call last)
    <ipython-input-5-b08090dd95a6> in <module>()
         10 image_utils.show_image(img)
         11 probabilities = imagenet_utils.predict(model=model, images=[img])
    ---> 12 imagenet_utils.probabilities2classes(probabilities, topk=5)
         13 input_tensor = imagenet_utils.image2batch(img)
         14 prediction_tensor = pytorch_utils.int2tensor(239)
    
    2 frames
    /usr/local/lib/python3.6/dist-packages/tensorwatch/imagenet_utils.py in __init__(self, json_path)
         54         json_path = json_path or os.path.join(os.path.dirname(__file__), 'imagenet_class_index.json')
         55 
    ---> 56         with open(os.path.abspath(json_path), "r") as read_file:
         57             class_json = json.load(read_file)
         58             self._idx2label = [class_json[str(k)][1] for k in range(len(class_json))]
    
    FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.6/dist-packages/tensorwatch/imagenet_class_index.json'
    

    My guess is that the pip package does not include the JSON file.

    Reference: A Simple Guide for Python Packaging, https://medium.com/small-things-about-python/lets-talk-about-python-packaging-6d84b81f1bb5

    opened by sakaia 3
  • Draw_model error

    Dear Sir

    Thanks for the excellent work!

    However, when I try this:

    alex_model = models.alexnet()
    tw.draw_model(alex_model, [1, 3, 224, 224])
    

    It returned the error 'Only output_size=[1, 1] is supported'.

    Python 3.7, PyTorch 1.01

    opened by Stephenfang51 2
  • Tutorial notebook: missing "summary.show()" line

    In the "notebooks/simple_logging.ipynb" notebook, the second-to-last code cell creates a Visualizer called "summary", but it is never displayed. The line "summary.show()" should be added to fix this.

    opened by rfernand2 2
  • pip install tensorwatch: missing ipywidgets and sklearn

    After doing "pip install tensorwatch", I tried to run "sum_log.py". It required me to manually pip install "ipywidgets" and "sklearn"; these should be included in tensorwatch's setup.py dependencies.

    opened by rfernand2 2
  • Fix gradcam.py

    I fixed issue #73 by adding new properties (self.handle_forward_hook and self.handle_backward_hook) and removing register_forward_hook() and register_backward_hook() (explain).

    opened by kikusui6192 1
  • Can't generate images

    import tensorwatch as tw
    import torchvision.models

    alexnet_model = torchvision.models.alexnet()
    tw.draw_model(alexnet_model, [1, 3, 224, 224])

    error:

    ModuleNotFoundError                       Traceback (most recent call last)
    ~/anaconda2/envs/pytorch1.0/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
        343             method = get_real_method(obj, self.print_method)
        344             if method is not None:
    --> 345                 return method()
        346             return None
        347         else:
    
    ~/anaconda2/envs/pytorch1.0/lib/python3.6/site-packages/tensorwatch/model_graph/hiddenlayer/graph.py in _repr_svg_(self)
        391     def _repr_svg_(self):
        392         """Allows Jupyter notebook to render the graph automatically."""
    --> 393         return self.build_dot(self.orientation)._repr_svg_()
        394
        395     def save(self, path, format="pdf"):
    
    ~/anaconda2/envs/pytorch1.0/lib/python3.6/site-packages/tensorwatch/model_graph/hiddenlayer/graph.py in build_dot(self, orientation)
        333         Returns a GraphViz Digraph object.
        334         """
    --> 335         from graphviz import Digraph
        336
        337         # Build GraphViz Digraph
    
    ModuleNotFoundError: No module named 'graphviz'

    <tensorwatch.model_graph.hiddenlayer.graph.Graph at 0x7fe7ce046898>

    Written as follows, there is no error, but no image is shown either:

    import tensorwatch as tw
    import torchvision.models

    alexnet_model = torchvision.models.alexnet()
    dd = tw.draw_model(alexnet_model, [1, 3, 224, 224])
    print(dd)

    <tensorwatch.model_graph.hiddenlayer.graph.Graph object at 0x7fe7cdfc5c88>

    bug 
    opened by chl916185 1
  • raise RuntimeError("ONNX symbolic expected a constant value in the trace")

    Hello, the model I am using is EfficientNet with PyTorch 1.0.1, Python 3.6, and CUDA 9.0, but I get an error.

    model.py

    from __future__ import print_function
    import argparse
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torchvision import datasets, transforms, models
    from efficientnet_pytorch import EfficientNet
    from efficientnet_pytorch import utils
    
    from torchsummary import summary
    from torchstat import stat
    from tensorboardX import SummaryWriter
    writer = SummaryWriter('log')
    
    import torch.onnx
    import tensorwatch as tw
    
    
    def train(args, model, device, train_loader, optimizer, epoch):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            output1 = torch.nn.functional.log_softmax(output, dim=1)
            loss = F.nll_loss(output1, target)
            #loss = F.l1_loss(output, target)
            loss.backward()
            optimizer.step()
    
            #new ynh
            # plot a point on the loss curve every 10 batches
            if batch_idx % 10 == 0:
                niter = epoch * len(train_loader) + batch_idx
                writer.add_scalar('Train/Loss', loss.data, niter)
    
            if batch_idx % args.log_interval == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                           100. * batch_idx / len(train_loader), loss.item()))
    
    
    def test(args, model, device, test_loader, epoch):
        model.eval()
        test_loss = 0
        correct = 0
        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(device), target.to(device)
                output = model(data)
                output1 = torch.nn.functional.log_softmax(output, dim=1)
                test_loss += F.nll_loss(output1, target, reduction='sum').item()  # sum up batch loss
                pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
                correct += pred.eq(target.view_as(pred)).sum().item()
    
        test_loss /= len(test_loader.dataset)
    
        # new ynh
        writer.add_scalar('Test/Loss', test_loss, epoch)  # the logged value is the average loss, not accuracy
    
    
        print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            test_loss, correct, len(test_loader.dataset),
            100. * correct / len(test_loader.dataset)))
    
    
    def main():
        # Training settings
        parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
        parser.add_argument('--batch-size', type=int, default=10, metavar='N',
                            help='input batch size for training (default: 10)')
        parser.add_argument('--test-batch-size', type=int, default=10, metavar='N',
                            help='input batch size for testing (default: 10)')
        parser.add_argument('--epochs', type=int, default=10, metavar='N',
                            help='number of epochs to train (default: 10)')
        parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
                            help='learning rate (default: 0.01)')
        parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
                            help='SGD momentum (default: 0.5)')
        parser.add_argument('--no-cuda', action='store_true', default=False,
                            help='disables CUDA training')
        parser.add_argument('--seed', type=int, default=1, metavar='S',
                            help='random seed (default: 1)')
        parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                            help='how many batches to wait before logging training status')
    
        parser.add_argument('--save-model', action='store_true', default=False,
                            help='For Saving the current Model')
        args = parser.parse_args()
        use_cuda = not args.no_cuda and torch.cuda.is_available()
    
        torch.manual_seed(args.seed)
    
        device = torch.device("cuda" if use_cuda else "cpu")
    
        kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
        train_loader = torch.utils.data.DataLoader(
            datasets.MNIST(root='./mnist', train=True,download=True,
                           transform=transforms.Compose([
                               transforms.Resize((224), interpolation=2),
                               transforms.Grayscale(3),
                               transforms.ToTensor(),
                           ])),
            batch_size=args.batch_size, shuffle=True, **kwargs)
        test_loader = torch.utils.data.DataLoader(
            datasets.MNIST(root='./mnist', train=False, transform=transforms.Compose([
                transforms.Resize((224), interpolation=2),
                transforms.Grayscale(3),
                transforms.ToTensor(),
                transforms.Normalize((0.1307,), (0.3081,))
            ])),
            batch_size=args.test_batch_size, shuffle=True, **kwargs)
    
        blocks_args, global_params = utils.get_model_params('efficientnet-b0', override_params=None)
        #model = EfficientNet.from_pretrained('efficientnet-b0').to(device)#.cuda()
        model = EfficientNet(blocks_args, global_params)#.to(device)  # .cuda()
    
        #dummy_input = torch.rand(1, 3, 224, 224)
        #writer.add_graph(model, (dummy_input,))
    
        #dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
        #model = model.cuda()
        #model1 = models.alexnet(pretrained=True).cuda()
        #torch.onnx.export(model1, dummy_input, "efficientnet.onnx", verbose=True)
    
        #print(model)
        tw.draw_model(model, [1, 3, 224, 224])
    
        #stat(model, (3, 224, 224))
        model.to(device)
        #summary(model, (3, 224, 224))
    
        print("-------------------------------------------")
    
    
    
        optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)
    
        for epoch in range(1, args.epochs + 1):
            train(args, model, device, train_loader, optimizer, epoch)
            test(args, model, device, test_loader, epoch)
    
        if (args.save_model):
            torch.save(model.state_dict(), "mnist_cnn.pt")
    
        writer.close()
    
    
    if __name__ == '__main__':
        main()
    

    utils.py

    """
    This file contains helper functions for building the model and for loading model parameters.
    These helper functions are built to mirror those in the official TensorFlow implementation.
    """
    
    import re
    import math
    import collections
    import torch
    from torch import nn
    from torch.nn import functional as F
    from torch.utils import model_zoo
    
    
    ########################################################################
    ############### HELPERS FUNCTIONS FOR MODEL ARCHITECTURE ###############
    ########################################################################
    
    
    # Parameters for the entire model (stem, all blocks, and head)
    GlobalParams = collections.namedtuple('GlobalParams', [
        'batch_norm_momentum', 'batch_norm_epsilon', 'dropout_rate',
        'num_classes', 'width_coefficient', 'depth_coefficient',
        'depth_divisor', 'min_depth', 'drop_connect_rate',])
    
    
    # Parameters for an individual model block
    BlockArgs = collections.namedtuple('BlockArgs', [
        'kernel_size', 'num_repeat', 'input_filters', 'output_filters',
        'expand_ratio', 'id_skip', 'stride', 'se_ratio'])
    
    
    # Change namedtuple defaults
    GlobalParams.__new__.__defaults__ = (None,) * len(GlobalParams._fields)
    BlockArgs.__new__.__defaults__ = (None,) * len(BlockArgs._fields)
    
    
    def relu_fn(x):
        """ Swish activation function """
        return x * torch.sigmoid(x)
    
    
    def round_filters(filters, global_params):
        """ Calculate and round number of filters based on depth multiplier. """
        multiplier = global_params.width_coefficient
        if not multiplier:
            return filters
        divisor = global_params.depth_divisor
        min_depth = global_params.min_depth
        filters *= multiplier
        min_depth = min_depth or divisor
        new_filters = max(min_depth, int(filters + divisor / 2) // divisor * divisor)
        if new_filters < 0.9 * filters:  # prevent rounding by more than 10%
            new_filters += divisor
        return int(new_filters)
    
    
    def round_repeats(repeats, global_params):
        """ Round number of block repeats based on depth multiplier. """
        multiplier = global_params.depth_coefficient
        if not multiplier:
            return repeats
        return int(math.ceil(multiplier * repeats))
    
    
    def drop_connect(inputs, p, training):
        """ Drop connect. """
        if not training: return inputs
        batch_size = inputs.shape[0]
        keep_prob = 1 - p
        random_tensor = keep_prob
        random_tensor += torch.rand([batch_size, 1, 1, 1], dtype=inputs.dtype)  # uniform [0,1)
        binary_tensor = torch.floor(random_tensor)
        output = inputs / keep_prob * binary_tensor
        return output
    
    
    class Conv2dSamePadding(nn.Conv2d):
        """ 2D Convolutions like TensorFlow """
        def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1, groups=1, bias=True):
            super().__init__(in_channels, out_channels, kernel_size, stride, 0, dilation, groups, bias)
            self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]]*2
    
        def forward(self, x):
            ih, iw = x.size()[-2:]
            kh, kw = self.weight.size()[-2:]
            sh, sw = self.stride
            oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
            pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
            pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
            if pad_h > 0 or pad_w > 0:
                #print("pad_h",x.shape[2],"pad_w",x.shape[3])
                x = F.pad(x, [pad_w//2, pad_w - pad_w//2, pad_h//2, pad_h - pad_h//2])
                #print("pad_h",x.shape[2],"pad_w",x.shape[3])
                #print("===========================")
            return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
    
    
    ########################################################################
    ############## HELPERS FUNCTIONS FOR LOADING MODEL PARAMS ##############
    ########################################################################
    
    
    def efficientnet_params(model_name):
        """ Map EfficientNet model name to parameter coefficients. """
        params_dict = {
            # Coefficients:   width,depth,res,dropout
            'efficientnet-b0': (1.0, 1.0, 224, 0.2),
            'efficientnet-b1': (1.0, 1.1, 240, 0.2),
            'efficientnet-b2': (1.1, 1.2, 260, 0.3),
            'efficientnet-b3': (1.2, 1.4, 300, 0.3),
            'efficientnet-b4': (1.4, 1.8, 380, 0.4),
            'efficientnet-b5': (1.6, 2.2, 456, 0.4),
            'efficientnet-b6': (1.8, 2.6, 528, 0.5),
            'efficientnet-b7': (2.0, 3.1, 600, 0.5),
        }
        return params_dict[model_name]
    
    
    class BlockDecoder(object):
        """ Block Decoder for readability, straight from the official TensorFlow repository """
    
        @staticmethod
        def _decode_block_string(block_string):
            """ Gets a block through a string notation of arguments. """
            assert isinstance(block_string, str)
    
            ops = block_string.split('_')
            options = {}
            for op in ops:
                splits = re.split(r'(\d.*)', op)
                if len(splits) >= 2:
                    key, value = splits[:2]
                    options[key] = value
    
            # Check stride
            assert (('s' in options and len(options['s']) == 1) or
                    (len(options['s']) == 2 and options['s'][0] == options['s'][1]))
    
            return BlockArgs(
                kernel_size=int(options['k']),
                num_repeat=int(options['r']),
                input_filters=int(options['i']),
                output_filters=int(options['o']),
                expand_ratio=int(options['e']),
                id_skip=('noskip' not in block_string),
                se_ratio=float(options['se']) if 'se' in options else None,
                stride=[int(options['s'][0])])
    
        @staticmethod
        def _encode_block_string(block):
            """Encodes a block to a string."""
            args = [
                'r%d' % block.num_repeat,
                'k%d' % block.kernel_size,
                's%d%d' % (block.strides[0], block.strides[1]),
                'e%s' % block.expand_ratio,
                'i%d' % block.input_filters,
                'o%d' % block.output_filters
            ]
            if 0 < block.se_ratio <= 1:
                args.append('se%s' % block.se_ratio)
            if block.id_skip is False:
                args.append('noskip')
            return '_'.join(args)
    
        @staticmethod
        def decode(string_list):
            """
            Decodes a list of string notations to specify blocks inside the network.
    
            :param string_list: a list of strings, each string is a notation of block
            :return: a list of BlockArgs namedtuples of block args
            """
            assert isinstance(string_list, list)
            blocks_args = []
            for block_string in string_list:
                blocks_args.append(BlockDecoder._decode_block_string(block_string))
            return blocks_args
    
        @staticmethod
        def encode(blocks_args):
            """
            Encodes a list of BlockArgs to a list of strings.
    
            :param blocks_args: a list of BlockArgs namedtuples of block args
            :return: a list of strings, each string is a notation of block
            """
            block_strings = []
            for block in blocks_args:
                block_strings.append(BlockDecoder._encode_block_string(block))
            return block_strings
    
    
    def efficientnet(width_coefficient=None, depth_coefficient=None,
                     dropout_rate=0.2, drop_connect_rate=0.2):
        """ Creates an EfficientNet model. """
    
        blocks_args = [
            'r1_k3_s11_e1_i32_o16_se0.25', 'r2_k3_s22_e6_i16_o24_se0.25',
            'r2_k5_s22_e6_i24_o40_se0.25', 'r3_k3_s22_e6_i40_o80_se0.25',
            'r3_k5_s11_e6_i80_o112_se0.25', 'r4_k5_s22_e6_i112_o192_se0.25',
            'r1_k3_s11_e6_i192_o320_se0.25',
        ]
        blocks_args = BlockDecoder.decode(blocks_args)
    
        global_params = GlobalParams(
            batch_norm_momentum=0.99,
            batch_norm_epsilon=1e-3,
            dropout_rate=dropout_rate,
            drop_connect_rate=drop_connect_rate,
            # data_format='channels_last',  # removed, this is always true in PyTorch
            num_classes=10,
            width_coefficient=width_coefficient,
            depth_coefficient=depth_coefficient,
            depth_divisor=8,
            min_depth=None
        )
    
        return blocks_args, global_params
    
    
    def get_model_params(model_name, override_params):
        """ Get the block args and global params for a given model """
        if model_name.startswith('efficientnet'):
            w, d, _, p = efficientnet_params(model_name)
            # note: all models have drop connect rate = 0.2
            blocks_args, global_params = efficientnet(width_coefficient=w, depth_coefficient=d, dropout_rate=p)
        else:
            raise NotImplementedError('model name is not pre-defined: %s' % model_name)
        if override_params:
            # ValueError will be raised here if override_params has fields not included in global_params.
            global_params = global_params._replace(**override_params)
        return blocks_args, global_params
    
    
    url_map = {
        'efficientnet-b0': 'http://storage.googleapis.com/public-models/efficientnet-b0-08094119.pth',
        'efficientnet-b1': 'http://storage.googleapis.com/public-models/efficientnet-b1-dbc7070a.pth',
        'efficientnet-b2': 'http://storage.googleapis.com/public-models/efficientnet-b2-27687264.pth',
        'efficientnet-b3': 'http://storage.googleapis.com/public-models/efficientnet-b3-c8376fa2.pth',
    }
    
    def load_pretrained_weights(model, model_name):
        """ Loads pretrained weights, and downloads if loading for the first time. """
        state_dict = model_zoo.load_url(url_map[model_name])
    
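        # Skip the final fully connected layer's weights.
        pretrained_dict = {k: v for k, v in state_dict.items() if k not in ("_fc.weight", "_fc.bias")}
        # state_dict() returns a new dict on each call, so update one copy and load that same copy.
        model_dict = model.state_dict()
        model_dict.update(pretrained_dict)
        model.load_state_dict(model_dict)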
    
        print('Loaded pretrained weights for {}'.format(model_name))
    
    
    opened by yangninghua 1
  • read values from files

    The README states that the results of multiple runs (stored in log files) can be compared, but it's not clear to me how that works. It seems that the WatcherClient only operates on the current stream and ignores values that were written to a file earlier. How can I plot the values from a log file that is not being streamed to?

    opened by mschrimpf 1
  • AttributeError: 'torch._C.Node' object has no attribute 'ival'

    OS version: Windows 10 64-bit
    Python version: 3.9.7
    PyTorch version: 1.10.2
    tensorwatch version: 0.9.1

        tw.draw_model(model, [1, 3, 512, 512], png_filename='unet.png')
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\tensorwatch\__init__.py", line 35, in draw_model
            g = pytorch_draw_model.draw_graph(model, input_shape)
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\tensorwatch\model_graph\hiddenlayer\pytorch_draw_model.py", line 35, in draw_graph
            dot = draw_img_classifier(model, args)
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\tensorwatch\model_graph\hiddenlayer\pytorch_draw_model.py", line 63, in draw_img_classifier
            g = SummaryGraph(non_para_model, dummy_input)
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\tensorwatch\model_graph\hiddenlayer\summary_graph.py", line 221, in __init__
            new_op['attrs'] = OrderedDict([(attr_name, node[attr_name]) for attr_name in node.attributeNames()])
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\tensorwatch\model_graph\hiddenlayer\summary_graph.py", line 221, in
            new_op['attrs'] = OrderedDict([(attr_name, node[attr_name]) for attr_name in node.attributeNames()])
          File "C:\Anaconda3\envs\pytorch1d10d2\lib\site-packages\torch\onnx\utils.py", line 1232, in _node_getitem
            return getattr(self, sel)(k)
        AttributeError: 'torch._C.Node' object has no attribute 'ival'
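
The traceback shows the failure occurring while `node[attr_name]` is read for every name returned by `node.attributeNames()`. One possible workaround, offered as an untested sketch and not as a confirmed fix, is to collect the attributes defensively and skip any that cannot be read. `collect_node_attrs` and `FakeNode` are hypothetical names. `FakeNode` only imitates the `attributeNames()` and `__getitem__` interface seen in the traceback so that the sketch runs without PyTorch.

```python
from collections import OrderedDict


def collect_node_attrs(node):
    """Read every attribute of a graph node, storing None for unreadable ones."""
    attrs = OrderedDict()
    for name in node.attributeNames():
        try:
            attrs[name] = node[name]
        except AttributeError:
            # The accessor for this attribute kind is missing; keep the key
            # so callers can still see that the attribute exists.
            attrs[name] = None
    return attrs


class FakeNode:
    """Hypothetical stand-in for a graph node, used only to exercise the helper."""

    def __init__(self, values, unreadable):
        self._values = values
        self._unreadable = set(unreadable)

    def attributeNames(self):
        return list(self._values)

    def __getitem__(self, name):
        if name in self._unreadable:
            raise AttributeError("'FakeNode' object has no attribute 'ival'")
        return self._values[name]


node = FakeNode({'axis': 1, 'value': 'opaque'}, unreadable=['value'])
attrs = collect_node_attrs(node)
```

Whether dropping such attributes still yields a correct diagram depends on how the drawing code uses them, so this only sidesteps the exception.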

    opened by strongdiamond 0
  • AttributeError: 'torch._C.Node' object has no attribute 'ival'

        from torchvision.models.resnet import resnet50
        import tensorwatch as tw
        model = resnet50()
        tw.draw_model(model, [1,3,512,512])

    When I use TensorWatch in Jupyter to draw a PyTorch model with the code above, it reports the following error: module 'torch.onnx' has no attribute 'set_training'

    I then changed 'set_training' to 'select_model_mode_for_export' in /anaconda3/lib/python3.9/site-packages/tensorwatch/model_graph/hiddenlayer/summary_graph.py, but that produces another error: 'torch._C.Node' object has no attribute 'ival'

    Versions: PyTorch 1.10.1, tensorwatch 0.9.1

    opened by bigcatMT 0
  • import tensorwatch error

        import tensorwatch as tw

        Connected to pydev debugger (build 211.7142.13)
        Traceback (most recent call last):
          File "", line 971, in _find_and_load
          File "", line 955, in _find_and_load_unlocked
          File "", line 665, in _load_unlocked
          File "", line 678, in exec_module
          File "", line 219, in _call_with_frames_removed
          File "/home/leef_wsl_u18/miniconda3/envs/py36/lib/python3.6/site-packages/tensorwatch/__init__.py", line 10, in
            from .text_vis import TextVis
          File "/home/leef_wsl_u18/miniconda3/envs/py36/lib/python3.6/site-packages/tensorwatch/text_vis.py", line 5, in
            from .vis_base import VisBase
          File "/home/leef_wsl_u18/miniconda3/envs/py36/lib/python3.6/site-packages/tensorwatch/vis_base.py", line 14, in
            class VisBase(Stream, metaclass=ABCMeta):
          File "/home/leef_wsl_u18/miniconda3/envs/py36/lib/python3.6/site-packages/tensorwatch/vis_base.py", line 16, in VisBase
            from IPython import get_ipython, display
        ImportError: cannot import name 'get_ipython'
        python-BaseException

        Process finished with exit code 1

    opened by leaf918 0
  • The code for counting the duration is wrong.

    I found two errors in the duration-measuring code in analyzer.py:

    1. In PyTorch, GPU operations execute asynchronously. If we use the following code to record the start and end times, the measured duration will be far too short, because the end time is recorded without waiting for the GPU to finish the computation.

    https://github.com/microsoft/tensorwatch/blob/142f83a7cb8c54e47e9bab06cb3a1ef8ae225422/tensorwatch/model_graph/torchstat/analyzer.py#L96

    https://github.com/microsoft/tensorwatch/blob/142f83a7cb8c54e47e9bab06cb3a1ef8ae225422/tensorwatch/model_graph/torchstat/analyzer.py#L101

    2. If a module in a CNN is run through forward propagation multiple times, the following code records only the duration of the last forward pass, not the duration of each one.

    https://github.com/microsoft/tensorwatch/blob/142f83a7cb8c54e47e9bab06cb3a1ef8ae225422/tensorwatch/model_graph/torchstat/analyzer.py#L102

    Here is my solution:

    # tensorwatch\tensorwatch\model_graph\torchstat\analyzer.py
    class ModuleStats:
        def __init__(self, name) -> None:
            # self.duration = 0.0
            self.duration = []
    
    def _forward_pre_hook(module_stats:ModuleStats, module:nn.Module, input):
        assert not module_stats.done
        torch.cuda.synchronize()
        module_stats.start_time = time.time()
    
    def _forward_post_hook(module_stats:ModuleStats, module:nn.Module, input, output):
        assert not module_stats.done
        torch.cuda.synchronize()
        module_stats.end_time = time.time()
        # Using a list to store the duration of each forward propagation.
        # module_stats.duration = module_stats.end_time-module_stats.start_time
        module_stats.duration.append(module_stats.end_time - module_stats.start_time)
        # other code
    
    # tensorwatch\tensorwatch\model_graph\torchstat\stat_tree.py        
    class StatNode(object):
        def __init__(self, name=str(), parent=None):
            # self.duration = 0
            self._duration = []
            
        @property
        def duration(self):
            # total_duration = self._duration
            total_duration = sum(self._duration)
            for child in self.children:
                total_duration += child.duration
            return total_duration
            # Alternatively, to expose the per-call durations instead of the total:
            # return self._duration
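
The two points above (synchronize before reading the clock, and keep one duration per call) can be illustrated without PyTorch. This is a minimal sketch under stated assumptions: `CallTimer` and `timed` are hypothetical names, and `sync` stands for any callable that blocks until pending device work is done (for GPU code that could be `torch.cuda.synchronize`).

```python
import time


class CallTimer:
    """Records one duration per call instead of overwriting a single value."""

    def __init__(self, sync=None):
        # Optional callable that waits for outstanding device work to finish.
        self._sync = sync
        self._start = None
        self.durations = []

    def pre(self):
        if self._sync is not None:
            self._sync()
        self._start = time.perf_counter()

    def post(self):
        if self._sync is not None:
            self._sync()
        # Append rather than assign, so repeated calls are all kept.
        self.durations.append(time.perf_counter() - self._start)

    @property
    def total(self):
        return sum(self.durations)


def timed(timer, fn):
    """Wrap fn so that every call is bracketed by timer.pre() and timer.post()."""
    def wrapper(*args, **kwargs):
        timer.pre()
        try:
            return fn(*args, **kwargs)
        finally:
            timer.post()
    return wrapper


timer = CallTimer()
relu = timed(timer, lambda x: max(x, 0.0))
outputs = [relu(v) for v in (-1.0, 0.5, 2.0)]
```

After the three calls, `timer.durations` holds three entries and `timer.total` is their sum, mirroring the list-based `duration` in the proposal above.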
    

    I also include a simple comparison. In the Bottleneck of the ResNet backbone, the same relu module is called three times, so there should be three corresponding durations. In the TensorWatch statistics, however, only one relu record appears for the Bottleneck.

    https://github.com/open-mmlab/mmdetection/blob/f07de13b82b746dde558202f720ec2225f276d73/mmdet/models/backbones/resnet.py#L260-L299

    With my modified code, the durations of all three calls to the relu function are recorded.

    opened by Mrliduanyang 0
  • The save operation succeeded but the notebook does not appear to be valid

    OS: Windows 7, Python 3.6.8. Running the demo:

        %matplotlib notebook
        import tensorwatch as tw
        client = tw.WatcherClient()
        loss_stream = client.create_stream(expr='lambda d:(d.iter, d.loss)')
        loss_plot = tw.Visualizer(loss_stream, vis_type='line', xtitle='Epoch', ytitle='Train Loss')
        loss_plot.show()

    The notebook then reports: "The save operation succeeded but the notebook does not appear to be valid."

    opened by xingha 0