Linescanning - Package for (pre)processing of anatomical and (linescanning) fMRI data

Overview

line scanning repository

plot

This repository contains all of the tools used during the acquisition and postprocessing of line scanning data at the Spinoza Centre for Neuroimaging in Amsterdam. The script master controls the modules prefixed by spinoza_, which in turn call upon various scripts in the utils and bin directory. The scripts in the latter folders are mostly helper scripts to make life a tad easier. The repository contains a mix of languages in bash, python, and matlab.

In active development - do not use unless otherwise instructed by repo owners

Documentation for this package can be found at readthedocs (not up to date)

Policy & To Do

  • install using python setup.py develop
  • Docstrings in numpy format.
  • PEP8 - please set your editor to autopep8 on save!
  • Documentation with Sphinx (WIP)
  • Explore options to streamline code
  • Examples of applications for package (integration of pycortex & pRFpy)

overview of the pipeline

how to set up

Clone the repository: git clone https://github.com/gjheij/linescanning.git.

To setup the bash environment, edit setup file linescanning/shell/spinoza_setup:

  • line 76: add the path to your matlab installation if available (should be, for better anatomicall preprocessing)
  • line 87: add the path to your SPM installation
  • line 92: add your project name
  • line 97: add the path to project name as defined in line 92
  • line 102: add whether you're using (ME)MP(2)RAGE. This is required because the pipeline allows the usage of the average of an MP2RAGE and MP2RAGEME acquisition
  • line 105: add which type of data you're using (generally this will be the same as line 102)

Go to linescanning/shell and hit ./spinoza_setup setup setup. This will print a set of instructions that you need to follow. If all goes well this will make all the script executable, set all the paths, and install the python modules. The repository comes with a conda environment file, which can be activated with: conda create --name myenv --file environment.yml.

How to plan the line

plot

We currently aim to have two separate sessions: in the first session, we acquire high resolution anatomical scans and perform a population receptive field (pRF-) mapping paradigm (Dumoulin and Wandell, 2008) to delineate the visual field. After this session, we create surfaces of the brain and map the pRFs onto that via fMRIprep and pRFpy. We then select a certain vertex based on the parameters extracted from the pRF-mapping: eccentricity, size, and polar angle. Using these parameters, we can find an optimal vertex. We can obtain the vertex position, while by calculating the normal vector, we obtain the orientation that line should have (parellel to the normal vector and through the vertex point). Combining this information, we know how the line should be positioned in the first session anatomy. In the second session, we first acquire a low-resolution MP2RAGE with the volume coil. This is exported and registered to the first session anatomy during the second session to obtain the translations and rotations needed to map the line from the first session anatomy to the currently active second session by inputting the values in the MR-console. This procedure from registration to calculation of MR-console values is governed by spinoza_lineplanning and can be called with master -m 00 -s -h .

Comments
  • Usage of pipeline requires bash>4.0

    Usage of pipeline requires bash>4.0

    Some pieces of code will fail if the bash version is below 4.0. Could be made more explicit by specifying bash in front of the spinoza_* modules in the master script.

    Could also add a check for bash version in spinoza_setup during installation. Update docs too.

    opened by gjheij 9
  • Minor changes to be compatible with 3x MP2RAGE & T2

    Minor changes to be compatible with 3x MP2RAGE & T2

    master:

    if [[ $? -eq 0 ]]; then
    
      if [[ "${@}" == *"-q"* ]]; then
        ${job} spinoza_sinusfrommni
        exit 1
      fi
    
      echo "---------------------------------------------------------------------------------------------------"
      echo "[${module}] Create sagittal sinus mask using the MNI-mask & T1w/T2w-ratio"
    
      if [[ -d ${DIR_DATA_DERIV}/pymp2rage ]]; then
        INPUT_DIR=${DIR_DATA_DERIV}/pymp2rage
      else
        INPUT_DIR=${DIR_DATA_HOME}
      fi
    
      ${job} spinoza_sinusfrommni ${SUBJECT} ${SESSION} ${OW} ${INPUT_DIR} ${DIR_DATA_DERIV}/ants ${DIR_DATA_DERIV}/manual_masks
    
      if [[ $? -ne 0 ]]; then
        echo "ERROR in `basename ${0}`: spinoza_sinusfrommni exited with non-zero status"
        exit 1
      fi
    
      echo "[${module}] DONE" && echo
    
    fi
    

    spinoza_averageanatomies:

      t2_newspace=`find ${input_dir} -type f \( -name "*acq-3DTSE*" -and -name "*space*" \) 2>/dev/null`
      if [ ! -z ${t2_newspace} ]; then
        cp ${t2_newspace} ${output_dir}/${base}_acq-${DATA}_T2.nii.gz 2>/dev/null
      fi
    

    spinoza_brainextraction:

                echo "CAT12 was selected, using following parameters:"
                echo " -input  = `basename ${input}`"
                echo " -output = `basename ${brainmask}`"
                
                call_cat12 ${mode} -s ${SPM_PATH} ${input} ${BET}
    

    spinoza_registration:

        moving=`find "${input_dir}" -type f \( -iname "*${ACQ[1]^^}_*" -and -name "*.nii.gz" -and -not -name "*space-*" \) 2>/dev/null`
    

    spinoza_sinusfrommni:

                call_t12ratio --t1 ${t1} --t2 ${t2} -o ${pn_anat}/${base}_acq-${DATA^^}
    
    
    opened by carlottasabate 2
  • different affines for spm and fsl masks - module 04

    different affines for spm and fsl masks - module 04

    Hi,

    When running module 4 on the MP2RAGE of one of my participants I get the following error

    Combining SPM & FSL mask to include cerebellum too
    Traceback (most recent call last):
      File "/Users/verissimo/linescanning/bin/call_mp2rage", line 274, in <module>
        main(sys.argv[1:])
      File "/Users/verissimo/linescanning/bin/call_mp2rage", line 258, in main
        new_mask = image.math_img('(spm + fsl) > 0', spm=brainmask, fsl=opj(inputdir, 'tmp_mask.nii.gz'))
      File "/Users/verissimo/.local/lib/python3.7/site-packages/nilearn/image/image.py", line 907, in math_img
        _check_same_fov(*niimgs, raise_error=True)
      File "/Users/verissimo/.local/lib/python3.7/site-packages/nilearn/_utils/niimg_conversions.py", line 67, in _check_same_fov
        for e in errors]))
    ValueError: ("Input images cannot be compared, you provided 'dict_values(['/Volumes/Verissimo_Seagate_Hub_Plus/Projects/FAM/DATA/anat_preprocessing/derivatives/pymp2rage/sub-011/ses-1/sub-011_ses-1_acq-MP2RAGE_desc-spm_mask.nii.gz', '/Volumes/Verissimo_Seagate_Hub_Plus/Projects/FAM/DATA/anat_preprocessing/sub-011/ses-1/tmp_mask.nii.gz'])',", 'Following field of view errors were detected:\n- img_#0 and img_#1 do not have the same affine')
    

    When checking, indeed their affine is slighty different (probably due to some rounding error) img_1.affine

    array([[ 6.99586272e-01, -2.39259005e-02,  0.00000000e+00,
            -8.79358139e+01],
           [ 2.40633488e-02,  6.95588470e-01,  4.85897064e-04,
            -1.21713348e+02],
           [-1.68085098e-05, -4.85658646e-04,  6.95999801e-01,
            -1.34004532e+02],
           [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
             1.00000000e+00]])
    

    img_2affine

    array([[ 6.99586272e-01, -2.39258446e-02,  0.00000000e+00,
            -8.79358139e+01],
           [ 2.40633450e-02,  6.95588470e-01,  4.85899596e-04,
            -1.21713348e+02],
           [-1.67993858e-05, -4.85612400e-04,  6.95999801e-01,
            -1.34004532e+02],
           [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
             1.00000000e+00]])
    

    I didn't have this before, so am wondering why this only happened now... And which of the affines is the one that should be used throughout the pipeline?

    opened by iverissimo 2
  • input file name in call_stats2surf

    input file name in call_stats2surf

    "nii.gz" is by default part of the find-command, so if you input a file with nii.gz at the end it will look for a file with nii.gz.nii.gz

    could be fixed via sed 's,.nii.gz.nii.gz,.nii.gz,' or if loop

    opened by hlesaint 2
  • One class to rule them all; let ParseFuncFile inherit from ParseExpToolsFile

    One class to rule them all; let ParseFuncFile inherit from ParseExpToolsFile

    Now I have separate classes for formatting the functional data (ParseFuncFile), experimental data (ParseExpToolsFile), eyetracker data (ParseEyetrackerFile), and physio files (ParsePhysioFile). Might be good to have ParseFuncFile as base class, and have it inherit the other classes.

    Right now, ParseExpToolsFile already inherits ParseEyetrackerFile, so we'd only have to include the former into ParseFuncFile. I envision the inputs to be solely list of files corresponding to fMRI, exptools, eyetracker, and physio (maybe even just BIDS-like directory).

    And maybe split these classes into a separate dataset.py-module rather than clogging up utils.py

    opened by gjheij 2
  • melodic axis of power spectra

    melodic axis of power spectra

    can you change the limit of the x-axis in the power spectra here: /data1/projects/MicroFunc/Luisa/programs/linescanning/linescanning/preproc.py to

                # get frequency/power
                freq = get_freq(tc, TR=self.TR, spectrum_type="fft")
    
                plotting.LazyPlot(
                    freq[1],
                    xx=freq[0],
                    color=color,
                    x_label="frequency (Hz)",
                    y_label="power (a.u.)",
                    title="power spectra",
                    axs=ax_freq,
                    line_width=2,
                    x_lim=[0,1/(2*self.TR)],
                    **kwargs)
    
    

    So that we actually have the maximum frequency available Thanks :)

    opened by luisarai 1
  • missing warp-files for call_antsapplytransforms

    missing warp-files for call_antsapplytransforms

    When running spinoza_sinusfrommni got the missing file error even though the files are in the the ants folder. Noticed that lines 172 and 173 expected MNI152Nlin6Asym and files created in module 5 are called MNI152NLin6Asym (so capital letter). I guess it's related to the version of ANTs used, but have not checked to be sure

    opened by iverissimo 1
  • Check if bias correction of T2 improves pial recon

    Check if bias correction of T2 improves pial recon

    I just noticed that the T2's tend to have dropout right in the area we want it to properly work (i.e., visual areas). I'll check if bias correction prior to FreeSurfer make recon even better T2_dropout

    opened by gjheij 1
  • add image orientation check

    add image orientation check

    Realized that MP2RAGE I had was in PSR orientation (336x336x265) and code expects image to be RAS (265x336x336). This is only caught in module 13 in call_gdhmasking - when calculating 'outside' mask (line 248) from dilated dura mask and sinus drawn in ITKsnap, they will have different affines... Because I noticed this, I reoriented the two inversions niftis with a quick script

    orig_img = nb.load(input_file)
    canonical_img = nb.as_closest_canonical(orig_img)
    nb.save(canonical_img, input_file.replace('.nii.gz','RAS_.nii.gz'))
    

    and re-ran the pipeline. Then all was fine. Would be better to have some orientation check in the first modules (maybe in spinoza_qmrimaps?) to avoid this. Can be done by re-orienting the images in the pipeline, or throwing an error so the user knows the orientation should be RAS.

    opened by iverissimo 1
  • matlab alias fix in call_bashhelper

    matlab alias fix in call_bashhelper

    When calling matlab, current code assumes that there is an alias command matlab. Because I switched from zshell to bash, alias are not inherited and calling matlab would return command not found error message. My fix for this was to replace it with $MATLAB_DIR: $MATLAB_DIR -nosplash -nodisplay -batch "addpath('$(dirname ${1})'); $(basename ${1} .m)" | tail -n +${skip_lines} # suppress intro text This worked, not sure if best fix

    opened by iverissimo 1
  • custom stimulus duration input

    custom stimulus duration input

    def check_input_is_list(obj, var=None, list_element=0):
    
        if hasattr(obj, var):
            attr = getattr(obj, var)
        else:
            raise ValueError(f"Class does not have '{var}'-attribute")
    
        obj_attr = "func_file"
        if hasattr(obj, "tsv_file"):
            obj_attr = "tsv_file"
        
        if isinstance(attr, list) or isinstance(attr, np.ndarray):
            if len(attr) != len(getattr(obj,obj_attr)):
                raise ValueError(f"Length of '{var}' ({len(attr)}) does not match number of func files ({len(getattr(obj,obj_attr))}). Either specify a list of equal lenghts or 1 integer value for all volumes")
    
            return attr[list_element]
        else:
            return attr
    
    class ParseExpToolsFile(ParseEyetrackerFile):
    
        """ParseExpToolsFile()
    
        Class for parsing tsv-files created during experiments with Exptools2. The class will read in the file, read when the experiment actually started, correct onset times for this start time and time deleted because of removing the first few volumes (to do this correctly, set the `TR` and `deleted_first_timepoints`). You can also provide a numpy array/file containing eye blinks that should be added to the onset times in real-world time (seconds). In principle, it will return a pandas DataFrame indexed by subject and run that can be easily concatenated over runs. This function relies on the naming used when programming the experiment. In the `session.py` file, you should have created `phase_names=['iti', 'stim']`; the class will use these things to parse the file.
    
        Parameters
        ----------
        tsv_file: str, list
            path pointing to the output file of the experiment
        subject: int
            subject number in the returned pandas DataFrame (should start with 1, ..., n)
        run: int
            run number you'd like to have the onset times for
        button: bool
            boolean whether to include onset times of button responses (default is false). ['space'] will be ignored as response
        blinks: str, np.ndarray
            string or array containing the onset times of eye blinks as extracted with hedfpy
        TR: float
            repetition time to correct onset times for deleted volumes
        deleted_first_timepoints: int
            number of volumes to delete to correct onset times for deleted volumes. Can be specified for each individual run if `tsv_file` is a list
        use_bids: bool, optional
            If true, we'll read BIDS-components such as 'sub', 'run', 'task', etc from the input file and use those as indexers, rather than sequential 1,2,3.
        funcs: str, list, optional
            List of functional files that is being passed down down to :class:`linescanning.dataset.ParseEyetrackerFile`. Required for correct resampling to functional space
        edfs: str, list, optional
            List of eyetracking output files that is being passed down down to :class:`linescanning.dataset.ParseEyetrackerFile`.
        verbose: bool, optional
            Print details to the terminal, default is False
        phase_onset: int, optional
            Which phase of exptools-trial should be considered the actual stimulus trial. Usually, `phase_onset=0` means the interstimulus interval. Therefore, default = 1
        stim_duration: str, int, optional
            If desired, add stimulus duration to onset dataframe. Can be one of 'None', 'stim' (to use duration from exptools'  log file) or any given integer
        add_events: str, list, optional
            Add additional events to onset dataframe. Must be an existing column in the exptools log file. For intance, `responses` and `event_type = stim` are read in by default, but if we have a separate column containing the onset of some target (e.g., 'target_onset'), we can add these times to the dataframe with `add_events='target_onset'`.
        event_names: str, list, optional
            Custom names for manually added events through `add_events` if the column names are not the names you want to use in the dataframe. E.g., if I find `target_onset` too long of a name, I can specify `event_names='target'`. If `add_events` is a list, then `event_names` must be a list of equal length if custom names are desired. By default we'll take the names from `add_events`
        RTs: bool, optional
            If we have a design that required some response to a stimulus, we can request the reaction times. Default = False
        RT_relative_to: str, optional
            If `RTs=True`, we need to know relative to what time the button response should be offset. Only correct responses are considered, as there's a conditional statement that requires the present of the reference time (e.g., `target_onset`) and button response. If there's a response but no reference time, the reaction time cannot be calculated. If you do not have a separate reference time column, you can specify `RT_relative_to='start'` to calculate the reaction time relative to onset time. If `RT_relative_to != 'start'`, I'll assume you had a target in your experiment in X/n_trials. From this, we can calculate the accuracy and save that to `self.df_accuracy`, while reaction times are saved in`self.df_rts`
    
        Examples
        ----------
        >>> from linescanning.utils import ParseExpToolsFile
        >>> file = 'some/path/to/exptoolsfile.tsv'
        >>> parsed_file = ParseExpToolsFile(file, subject=1, run=1, button=True)
        >>> onsets = parsed_file.get_onset_df()
    
        >>> # If you want to get all your subjects and runs in 1 nideconv compatible dataframe, you can do something like this:
        >>> onsets = []
        >>> run_subjects = ['001','002','003']
        >>> for sub in run_subjects:
        >>>     path_tsv_files = os.path.join(f'some/path/sub-{sub}')
        >>>     f = os.listdir(path_tsv_files)
        >>>     nr_runs = []; [nr_runs.append(os.path.join(path_tsv_files, r)) for r in f if "events.tsv" in r]
        >>> 
        >>>     for run in range(1,len(nr_runs)+1):
        >>>         sub_idx = run_subjects.index(sub)+1
        >>>         onsets.append(ParseExpToolsFile(df_onsets, subject=sub_idx, run=run).get_onset_df())
        >>>         
        >>> onsets = pd.concat(onsets).set_index(['subject', 'run', 'event_type'])
        """
    
        def __init__(
            self, 
            tsv_file, 
            subject=1, 
            run=1, 
            button=False, 
            blinks=None, 
            RTs=False,
            RT_relative_to=None,
            TR=0.105, 
            deleted_first_timepoints=0, 
            edfs=None, 
            funcs=None, 
            use_bids=True,
            verbose=False,
            phase_onset=1,
            stim_duration=None,
            add_events=None,
            event_names=None,
            **kwargs):
    
            self.tsv_file                       = tsv_file
            self.sub                            = subject
            self.run                            = run
            self.TR                             = TR
            self.deleted_first_timepoints       = deleted_first_timepoints
            self.button                         = button
            self.blinks                         = blinks
            self.funcs                          = funcs
            self.edfs                           = edfs
            self.use_bids                       = use_bids
            self.verbose                        = verbose
            self.phase_onset                    = phase_onset
            self.stim_duration                  = stim_duration
            self.RTs                            = RTs
            self.RT_relative_to                 = RT_relative_to
            self.add_events                     = add_events
            self.event_names                    = event_names
            self.__dict__.update(kwargs)
    
            if self.edfs != None:
                super().__init__(
                    self.edfs, 
                    subject=self.sub, 
                    func_file=self.funcs, 
                    TR1=self.TR, 
                    use_bids=self.use_bids, 
                    verbose=self.verbose)
            else:
                self.include_blinks = False
    
            if self.verbose:
                print("\nEXPTOOLS")
    
            if isinstance(self.tsv_file, str):
                self.tsv_file = [self.tsv_file]
    
            if isinstance(self.tsv_file, list):
                df_onsets = []
                df_rts = []
                df_accuracy = []
                for run, onset_file in enumerate(self.tsv_file):
    
                    if self.use_bids:
                        bids_comps = utils.split_bids_components(onset_file)
                        for el in ['sub', 'run']:
                            setattr(self, el, bids_comps[el])
                    else:
                        self.run = run+1
    
                    # include eyeblinks?
                    if self.include_blinks:
                        self.blinks = self.fetch_blinks_run(run=self.run)
    
                    # check if we got different nr of vols to delete per run
                    delete_vols = check_input_is_list(self, "deleted_first_timepoints", list_element=run)
    
                    # check if we got different stimulus durations per run
                    duration = check_input_is_list(self, var="stim_duration", list_element=run)
    
                    # read in the exptools-file
                    self.preprocess_exptools_file(
                        onset_file, 
                        run=self.run, 
                        delete_vols=delete_vols, 
                        phase_onset=self.phase_onset,
                        duration=duration)
    
                    # append to df
                    df_onsets.append(self.get_onset_df(index=False))
    
                    # check if we got RTs
                    try:
                        df_rts.append(self.get_rts_df(index=False))
                    except:
                        pass
    
                    # check if we got accuracy (only if RT_relative_to != 'start')
                    try:
                        df_accuracy.append(self.get_accuracy(index=False))
                    except:
                        pass
    
    
                # concatemate df
                self.df_onsets = pd.concat(df_onsets).set_index(['subject', 'run', 'event_type'])
    
                # rts
                try:
                    self.df_rts = pd.concat(df_rts).set_index(['subject', 'run'])
                except:
                    pass
    
                # accuracy
                try:
                    self.df_accuracy = pd.concat(df_accuracy).set_index(['subject', 'run'])
                except:
                    pass
    
            # get events per run
            self.events_per_run = self.events_per_run()
    
    
        def events_per_run(self):
            n_runs = np.unique(self.df_onsets.reset_index()['run'].values)
            events = {}
            for run in n_runs:
                df = utils.select_from_df(self.df_onsets, expression=f"run = {run}", index=None)
                events[run] = np.unique(df['event_type'].values)
    
            return events
    
        def events_single_run(self, run=1):
            return self.events_per_run[run]
    
        def preprocess_exptools_file(self, tsv_file, run=1, delete_vols=0, phase_onset=1, duration=None):
            
            if self.verbose:
                print(f"Preprocessing {tsv_file}")
            with open(tsv_file) as f:
                self.data = pd.read_csv(f, delimiter='\t')
    
            # trim onsets to first 't'
            delete_time         = delete_vols*self.TR
            self.start_time     = float(self.data.loc[(self.data['event_type'] == "pulse") & (self.data['phase'] == 0)]['onset'].values[0])
            self.trimmed        = self.data.loc[(self.data['event_type'] == "stim") & (self.data['phase'] == phase_onset)].iloc[1:,:]
            self.onset_times    = self.trimmed['onset'].values[...,np.newaxis]
    
            skip_duration = False
            if isinstance(duration, float) or isinstance(duration, int):
                self.durations = np.full_like(self.onset_times, float(duration))
            elif duration == None:
                skip_duration = True
            else:
                self.durations = self.trimmed['duration'].values[...,np.newaxis]
            
            self.condition = self.trimmed['condition'].values[..., np.newaxis]
            if self.verbose:
                print(f" 1st 't' @{round(self.start_time,2)}s")
            
            # add button presses
            if self.button:
    
                # get dataframe with responses
                self.response_df = self.data.loc[(self.data['event_type'] == "response") & (self.data['response'] != 'space')]
    
                # get the onset times
                self.response_times = self.response_df['onset'].values[...,np.newaxis]
    
                # stack onset times
                self.onset_times = np.vstack([self.onset_times, self.response_times])
    
                # make a condition column
                self.response_condition = self.response_df['response'].values[...,np.newaxis]
    
                # stack it onto existing condition array
                self.condition = np.vstack([self.condition, self.response_condition])
            
            # check if we should include other events
            if isinstance(self.add_events, str):
                self.add_events = [self.add_events]
    
            if isinstance(self.event_names, str):
                self.event_names = [self.event_names]            
    
            if isinstance(self.add_events, list):
                if isinstance(self.event_names, list):
                    if len(self.event_names) != len(self.add_events):
                        raise ValueError(f"Length ({len(self.add_events)}) of added events {self.add_events} does not equal the length ({len(self.event_names)}) of requested event names {self.event_names}")
                else:
                    self.event_names = self.add_events.copy()
    
                for ix,ev in enumerate(self.add_events):
                    ev_times = np.array([ii for ii in np.unique(self.data[ev].values)])
    
                    # filter for nan (https://stackoverflow.com/a/11620982)
                    ev_times = ev_times[~np.isnan(ev_times)][...,np.newaxis]
    
                    # create condition
                    ev_names = np.full(ev_times.shape, self.event_names[ix])
    
                    # add times and names to array
                    self.onset_times = np.vstack((self.onset_times, ev_times))
                    self.condition = np.vstack((self.condition, ev_names))
    
            # check if we should add duration (can't be used in combination with add_events)
            if not skip_duration:
                if isinstance(self.add_events, list):
                    raise TypeError(f"Cannot do this operation because I don't know the durations for the added events. Please consider using 'stim_duration!={self.stim_duration}' or 'add_events=None'")
                self.onset = np.hstack((self.onset_times, self.condition, self.durations))
            else:
                self.onset = np.hstack((self.onset_times, self.condition))
    
            # sort array based on onset times (https://stackoverflow.com/a/2828121)        
            self.onset = self.onset[self.onset[:,0].argsort()]
    
            # add eyeblinks
            if isinstance(self.blinks, np.ndarray) or isinstance(self.blinks, str):
    
                if self.verbose:
                    print(" Including eyeblinks")
    
                if isinstance(self.blinks, np.ndarray):
                    self.eye_blinks = self.blinks
                elif isinstance(self.blinks, str):
                    if self.blinks.endwith(".npy"):
                        self.eye_blinks = np.load(self.blinks)
                    else:
                        raise ValueError(f"Could not recognize type of {self.blinks}. Should be numpy array or string to numpy file")
    
                self.eye_blinks = self.eye_blinks.astype('object').flatten()
                tmp = self.onset[:,0].flatten()
    
                # combine and sort timings
                comb = np.concatenate((self.eye_blinks, tmp))
                comb = np.sort(comb)[...,np.newaxis]
    
                # add back event types by checking timing values in both arrays
                event_array = []
                for ii in comb:
    
                    if ii in self.onset:
                        idx = np.where(self.onset == ii)[0][0]
                        event_array.append(self.onset[idx][-1])
                    else:
                        idx = np.where(self.eye_blinks == ii)[0]
                        event_array.append('blink')
    
                event_array = np.array(event_array)[...,np.newaxis]
    
                self.onset = np.concatenate((comb, event_array), axis=1)
    
            # correct for start time of experiment and deleted time due to removal of inital volumes
            self.onset[:, 0] = self.onset[:, 0] - (self.start_time + delete_time)
    
            if self.verbose:
                print(f" Cutting {round(self.start_time + delete_time,2)}s from onsets")
    
                if not skip_duration:
                    print(f" Avg duration = {round(self.durations.mean(),2)}s")
    
            # make dataframe
            if skip_duration:
                columns = ['onset', 'event_type']
            else:
                columns = ['onset', 'event_type', 'duration']
                
            self.onset_df = self.index_onset(self.onset, columns=columns, subject=self.sub, run=run)
    
            # check if we should do reaction times
            if self.RTs:
                if not isinstance(self.RT_relative_to, str):
                    raise ValueError(f"Need a reference column to calculate reaction times (RTs), not '{self.RT_relative_to}'")
                
                # get response times
                if not hasattr(self, "response_df"):
                    self.response_df = self.data.loc[(self.data['event_type'] == "response") & (self.data['response'] != 'space')]
                
                # fetch trials were target happened
                self.id_target = {}
                self.id_no_target = []
                self.false_alarms = []
                self.correct_rejection = []
                for idx in self.trimmed['trial_nr'].values:
    
                    # cross-check
                    trial_overview = self.data.loc[(self.data['trial_nr'] == idx)]
                    
                    # get the values of reference column; skip if all elements are nan
                    if self.RT_relative_to != "start":
                        ref_values = np.unique(trial_overview[self.RT_relative_to].values)
                        has_nans = np.isnan(ref_values)
    
                        # should be safe as any response will not have a value in reference column
                        if False in has_nans:
                            
                            # increase number of target to remain agnostic about design
                            id_no_nan = np.where(has_nans == False)[0]
                            if len(id_no_nan) != 0:
                                self.id_target[idx] = ref_values[id_no_nan]
                        else:
                            # append no targets
                            self.id_no_target.append(idx)
                            
                            # false alarm = no target but response
                            if 'response' in list(trial_overview['event_type'].values):
                                self.false_alarms.append(idx)
                            else:
                                # correct rejection = no target and no response
                                self.correct_rejection.append(idx)
    
                self.rts = []
                self.hits = []
                for target_trial in self.id_target.keys():
            
                    # get corresponding reference value
                    if self.RT_relative_to != "start":
                        ref_value = self.id_target[target_trial]
                    else:
                        ref_value = trial_overview.loc[(trial_overview['event_type'] == 'stim')]['onset'].values
                    
                    # get response value (length will be 0 if dataframe is empty)
                    response_value = self.response_df.loc[(self.response_df['trial_nr'] == target_trial)]['onset'].values
    
                    # check if lengths of reference value and response value are equal
                    if len(response_value) == len(ref_value):
                        rt = response_value - ref_value
    
                        # ignore negative reaction times..
                        if rt > 0:
                            self.rts.append(rt)
                    
                if len(self.rts) >= 1:
                    self.rts = np.array(self.rts)
                else:
                    self.rts = np.array([0])
    
                if len(self.id_target) != 0:
                    self.hits = len(self.rts)/len(self.id_target)
                    self.miss = (len(self.id_target)-len(self.rts))/len(self.id_target)
                    self.fa = len(self.false_alarms)/len(self.id_no_target)
                    self.cr = len(self.correct_rejection)/len(self.id_no_target)
    
                    # set FA to something if 0 to avoid d' = inf
                    if self.fa == float(0):
                        self.fa = 0.5*(1/len(self.trimmed['trial_nr'].values))
                        add_fa_txt = f"[added {round(self.fa,2)} to avoid d' = inf]"
                    else:
                        add_fa_txt = ""
    
                    # set HITS to something if 0 to avoid d' = inf
                    if self.hits == float(0):
                        self.hits = 0.5*(1/len(self.trimmed['trial_nr'].values))
                        add_hits_txt = f"[added {round(self.hits,2)} to avoid d' = inf]"
                    else:
                        add_hits_txt = ""
    
                    # calculate d-prime
                    self.hitZ = stats.norm.ppf(self.hits)
                    self.faZ = stats.norm.ppf(self.fa)
                    self.dPrime = self.hitZ-self.faZ
    
                    # d-prime=0 is considered as pure guessing.
                    # d-prime=1 is considered as good measure of signal sensitivity/detectability.
                    # d-prime=2 is considered as awesome.
                    
                    self.accuracy_df = self.index_accuracy(np.array([self.hits, self.miss, self.fa, self.cr, self.dPrime], dtype=float)[np.newaxis,...], columns=['hits','miss','fa','cr','d_prime'], subject=self.sub, run=run)
    
                if self.verbose:
                    if hasattr(self, 'dPrime'):
                        print(f" Hits:\t{round(self.hits,2)}\t({len(self.rts)}/{len(self.id_target)})\t{add_hits_txt}")
                        print(f" Miss:\t{round(self.miss,2)}\t({(len(self.id_target)-len(self.rts))}/{len(self.id_target)})")
                        print(f" FA:\t{round(self.fa,2)}\t({len(self.false_alarms)}/{len(self.id_no_target)})\t{add_fa_txt}")
                        print(f" CR:\t{round(self.cr,2)}\t({len(self.correct_rejection)}/{len(self.id_no_target)})")
                        print(f" D':\t{round(self.dPrime,2)}\t(0=guessing;1=good;2=awesome)")
                    
                    print(f" Average reaction time (RT) = {round(self.rts.mean(),2)}s (relative to '{self.RT_relative_to}').")
    
                # parse into dataframe
                self.rts_df = self.index_rts(self.rts, columns=["RTs"], subject=self.sub, run=run)
    
    
        @staticmethod
        def index_onset(array, columns=None, subject=1, run=1, TR=0.105, set_index=False):
            
            if columns == None:
                df = pd.DataFrame(array)
            else:
                df = pd.DataFrame(array, columns=columns)
                
            df['subject']       = subject
            df['run']           = run
            df['event_type']    = df['event_type'].astype(str)
            df['onset']         = df['onset'].astype(float)
    
            # check if we got duration
            try:
                df['duration'] = df['duration'].astype(float)  
            except:
                pass
    
            if set_index:
                return df.set_index(['subject', 'run', 'event_type'])
            else:
                return df        
    
        @staticmethod
        def index_rts(array, columns=None, subject=1, run=1, set_index=False):
            
            if columns == None:
                df = pd.DataFrame(array)
            else:
                df = pd.DataFrame(array, columns=columns)
                
            df['subject']   = subject
            df['run']       = run
            df['RTs']       = df['RTs'].astype(float)
    
            if set_index:
                return df.set_index(['subject', 'run'])
            else:
                return df        
    
        @staticmethod
        def index_accuracy(array, columns=None, subject=1, run=1, set_index=False):
            
            if columns == None:
                df = pd.DataFrame(array)
            else:
                df = pd.DataFrame(array, columns=columns)
                
            df['subject']   = subject
            df['run']       = run
    
            if set_index:
                return df.set_index(['subject', 'run'])
            else:
                return df                   
    
        def get_onset_df(self, index=False):
            """Return the indexed DataFrame containing onset times"""
    
            if index:
                return self.onset_df.set_index(['subject', 'run', 'event_type'])
            else:
                return self.onset_df
    
        def get_rts_df(self, index=False):
            """Return the indexed DataFrame containing reaction times"""
    
            if index:
                return self.rts_df.set_index(['subject', 'run'])
            else:
                return self.rts_df
    
        def get_accuracy(self, index=False):
            """Return the indexed DataFrame containing reaction times"""
    
            if index:
                return self.accuracy_df.set_index(['subject', 'run'])
            else:
                return self.accuracy_df             
    
        def onsets_to_fsl(self, fmt='3-column', amplitude=1, output_base=None):
            """onsets_to_fsl
    
            This function creates a text file with a single column containing the onset times of a given condition. Such a file can be used for SPM or FSL modeling, but it should be noted that the onset times have been corrected for the deleted volumes at the beginning. So make sure your inputting the correct functional data in these cases.
    
            Parameters
            ----------
            subject: int
                subject number you'd like to have the onset times for
            run: int
                run number you'd like to have the onset times for
            condition: str
                name of the condition you'd like to have the onset times for as specified in the data frame
            fname: str
                path to output name for text file
    
            Returns
            ----------
            str
                if `fname` was specified, a new file will be created and `fname` will be returned as string pointing to that file
    
            list
                if `fname` was *None*, the list of onset times will be returned
            """
    
            onsets = self.df_onsets.copy()
            subj_list = self.get_subjects(onsets)
            for sub in subj_list:
                df = utils.select_from_df(onsets, expression=f"subject = {sub}")
    
                n_runs = self.get_runs(df)
    
                for run in n_runs:
                    onsets_per_run = utils.select_from_df(df, expression=f"run = {run}")
                    events_per_run = self.get_events(onsets_per_run)
    
                    for ix, ev in enumerate(events_per_run):
    
                        onsets_per_event = utils.select_from_df(onsets_per_run, expression=f"event_type = {events_per_run[ix]}")
                        if output_base == None:
                            if isinstance(self.tsv_file, list):
                                outdir = os.path.dirname(self.tsv_file[0])
                            elif isinstance(self.tsv_file, str):
                                outdir = os.path.dirname(self.tsv_file)
                            else:
                                outdir = os.getcwd()
    
                            fname = opj(outdir, f"{ev}_run-{run}.txt")
                        else:
                            fname = f"{output_base}{ix+1}_run-{run}.txt"
    
                        # fetch the onsets
                        event_onsets = onsets_per_event['onset'].values[..., np.newaxis]
                        if fmt == "3-column":
    
                            # check if we got duration
                            if 'duration' in list(onsets_per_event.columns):
                                duration_arr = onsets_per_event['duration'].values[..., np.newaxis]
                            else:
                                duration_arr = np.ones_like(onsets_per_event)
    
                            amplitude_arr = np.full_like(event_onsets, amplitude)
                            three_col = np.hstack((event_onsets, duration_arr, amplitude_arr))
    
                            print(f"Writing {fname}; {three_col.shape}")
                            np.savetxt(fname, three_col, delimiter='\t', fmt='%1.3f')
                        else:
                            np.savetxt(fname, event_onsets, delimiter='\t', fmt='%1.3f')
    
        @staticmethod
        def get_subjects(df):
            try:
                df = df.reset_index()
            except:
                pass
    
            return np.unique(df['subject'].values)
    
        @staticmethod
        def get_runs(df):
            try:
                df = df.reset_index()
            except:
                pass
    
            return np.unique(df['run'].values)
    
        @staticmethod
        def get_events(df):
            try:
                df = df.reset_index()
            except:
                pass
    
            return np.unique(df['event_type'].values)
    
    opened by luisarai 0
  • Add `shift` parameter to call_createimg

    Add `shift` parameter to call_createimg

    Sometimes we need to move the slices up/down depending on how the saturation slabs and B1 behave. This messes the beam-image up, so there needs to be an argument that says how much shift there was so we can correctly estimate the position of the line

    opened by gjheij 0
Owner
Jurjen Heij
Jurjen Heij
The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

BMC The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing". BibTex entry available here. B

Orange 383 Dec 16, 2022
Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

t5-japanese Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts. The following is a list of models that

Kimio Kuramitsu 1 Dec 13, 2021
Pre-Trained Image Processing Transformer (IPT)

Pre-Trained Image Processing Transformer (IPT) By Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Cha

HUAWEI Noah's Ark Lab 332 Dec 18, 2022
Data Preparation, Processing, and Visualization for MoVi Data

MoVi-Toolbox Data Preparation, Processing, and Visualization for MoVi Data, https://www.biomotionlab.ca/movi/ MoVi is a large multipurpose dataset of

Saeed Ghorbani 51 Nov 27, 2022
Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

InversePrompting Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting Code: The code is provided in the "chinese_ip"

THUDM 101 Dec 16, 2022
The official PyTorch implementation of recent paper - SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

This repository is the official PyTorch implementation of SAINT. Find the paper on arxiv SAINT: Improved Neural Networks for Tabular Data via Row Atte

Gowthami Somepalli 284 Dec 21, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 7, 2022
TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

Microsoft 1.3k Dec 30, 2022
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

null 1.9k Dec 28, 2022
Source codes of CenterTrack++ in 2021 ICME Workshop on Big Surveillance Data Processing and Analysis

MOT Tracked object bounding box association (CenterTrack++) New association method based on CenterTrack. Two new branches (Tracked Size and IOU) are a

null 36 Oct 4, 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

DeCLIP Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. Our paper is available in arxiv Updates ** Ou

Sense-GVT 470 Dec 30, 2022
CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

CLIP-Indonesian CLIP (Radford et al., 2021) is a multimodal model that can connect images and text by training a vision encoder and a text encoder joi

Galuh 17 Mar 10, 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training [Arxiv] VideoMAE: Masked Autoencoders are Data-Efficient Learne

Multimedia Computing Group, Nanjing University 697 Jan 7, 2023
Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

The Apache Software Foundation 34.7k Jan 4, 2023
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

Pranaydeep Singh 22 Dec 8, 2022
Automatically download the cwru data set, and then divide it into training data set and test data set

Automatically download the cwru data set, and then divide it into training data set and test data set.自动下载cwru数据集,然后分训练数据集和测试数据集

null 6 Jun 27, 2022
QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place — so you can focus on building the next big thing.

QuakeLabeler Quake Labeler was born from the need for seismologists and developers who are not AI specialists to easily, quickly, and independently bu

Hao Mai 15 Nov 4, 2022
Simple Python project using Opencv and datetime package to recognise faces and log attendance data in a csv file.

Attendance-System-based-on-Facial-recognition-Attendance-data-stored-in-csv-file- Simple Python project using Opencv and datetime package to recognise

null 3 Aug 9, 2022
Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

THUNLP 75 Nov 2, 2022