line scanning repository

This repository contains all of the tools used during the acquisition and postprocessing of line scanning data at the Spinoza Centre for Neuroimaging in Amsterdam. The script master controls the modules prefixed by spinoza_, which in turn call upon various scripts in the utils and bin directory. The scripts in the latter folders are mostly helper scripts to make life a tad easier. The repository contains a mix of languages in bash, python, and matlab.

In active development - do not use unless otherwise instructed by repo owners

Documentation for this package can be found at readthedocs (not up to date)

Policy & To Do

install using python setup.py develop
Docstrings in numpy format.
PEP8 - please set your editor to autopep8 on save!
Documentation with Sphinx (WIP)
Explore options to streamline code
Examples of applications for package (integration of pycortex & pRFpy)

overview of the pipeline

how to set up

Clone the repository: git clone https://github.com/gjheij/linescanning.git.

To setup the bash environment, edit setup file linescanning/shell/spinoza_setup:

line 76: add the path to your matlab installation if available (should be, for better anatomicall preprocessing)
line 87: add the path to your SPM installation
line 92: add your project name
line 97: add the path to project name as defined in line 92
line 102: add whether you're using (ME)MP(2)RAGE. This is required because the pipeline allows the usage of the average of an MP2RAGE and MP2RAGEME acquisition
line 105: add which type of data you're using (generally this will be the same as line 102)

Go to linescanning/shell and hit ./spinoza_setup setup setup. This will print a set of instructions that you need to follow. If all goes well this will make all the script executable, set all the paths, and install the python modules. The repository comes with a conda environment file, which can be activated with: conda create --name myenv --file environment.yml.

How to plan the line

We currently aim to have two separate sessions: in the first session, we acquire high resolution anatomical scans and perform a population receptive field (pRF-) mapping paradigm (Dumoulin and Wandell, 2008) to delineate the visual field. After this session, we create surfaces of the brain and map the pRFs onto that via fMRIprep and pRFpy. We then select a certain vertex based on the parameters extracted from the pRF-mapping: eccentricity, size, and polar angle. Using these parameters, we can find an optimal vertex. We can obtain the vertex position, while by calculating the normal vector, we obtain the orientation that line should have (parellel to the normal vector and through the vertex point). Combining this information, we know how the line should be positioned in the first session anatomy. In the second session, we first acquire a low-resolution MP2RAGE with the volume coil. This is exported and registered to the first session anatomy during the second session to obtain the translations and rotations needed to map the line from the first session anatomy to the currently active second session by inputting the values in the MR-console. This procedure from registration to calculation of MR-console values is governed by spinoza_lineplanning and can be called with master -m 00 -s -h.

def check_input_is_list(obj, var=None, list_element=0):

    if hasattr(obj, var):
        attr = getattr(obj, var)
    else:
        raise ValueError(f"Class does not have '{var}'-attribute")

    obj_attr = "func_file"
    if hasattr(obj, "tsv_file"):
        obj_attr = "tsv_file"
    
    if isinstance(attr, list) or isinstance(attr, np.ndarray):
        if len(attr) != len(getattr(obj,obj_attr)):
            raise ValueError(f"Length of '{var}' ({len(attr)}) does not match number of func files ({len(getattr(obj,obj_attr))}). Either specify a list of equal lenghts or 1 integer value for all volumes")

        return attr[list_element]
    else:
        return attr

class ParseExpToolsFile(ParseEyetrackerFile):

    """ParseExpToolsFile()

    Class for parsing tsv-files created during experiments with Exptools2. The class will read in the file, read when the experiment actually started, correct onset times for this start time and time deleted because of removing the first few volumes (to do this correctly, set the `TR` and `deleted_first_timepoints`). You can also provide a numpy array/file containing eye blinks that should be added to the onset times in real-world time (seconds). In principle, it will return a pandas DataFrame indexed by subject and run that can be easily concatenated over runs. This function relies on the naming used when programming the experiment. In the `session.py` file, you should have created `phase_names=['iti', 'stim']`; the class will use these things to parse the file.

    Parameters
    ----------
    tsv_file: str, list
        path pointing to the output file of the experiment
    subject: int
        subject number in the returned pandas DataFrame (should start with 1, ..., n)
    run: int
        run number you'd like to have the onset times for
    button: bool
        boolean whether to include onset times of button responses (default is false). ['space'] will be ignored as response
    blinks: str, np.ndarray
        string or array containing the onset times of eye blinks as extracted with hedfpy
    TR: float
        repetition time to correct onset times for deleted volumes
    deleted_first_timepoints: int
        number of volumes to delete to correct onset times for deleted volumes. Can be specified for each individual run if `tsv_file` is a list
    use_bids: bool, optional
        If true, we'll read BIDS-components such as 'sub', 'run', 'task', etc from the input file and use those as indexers, rather than sequential 1,2,3.
    funcs: str, list, optional
        List of functional files that is being passed down down to :class:`linescanning.dataset.ParseEyetrackerFile`. Required for correct resampling to functional space
    edfs: str, list, optional
        List of eyetracking output files that is being passed down down to :class:`linescanning.dataset.ParseEyetrackerFile`.
    verbose: bool, optional
        Print details to the terminal, default is False
    phase_onset: int, optional
        Which phase of exptools-trial should be considered the actual stimulus trial. Usually, `phase_onset=0` means the interstimulus interval. Therefore, default = 1
    stim_duration: str, int, optional
        If desired, add stimulus duration to onset dataframe. Can be one of 'None', 'stim' (to use duration from exptools'  log file) or any given integer
    add_events: str, list, optional
        Add additional events to onset dataframe. Must be an existing column in the exptools log file. For intance, `responses` and `event_type = stim` are read in by default, but if we have a separate column containing the onset of some target (e.g., 'target_onset'), we can add these times to the dataframe with `add_events='target_onset'`.
    event_names: str, list, optional
        Custom names for manually added events through `add_events` if the column names are not the names you want to use in the dataframe. E.g., if I find `target_onset` too long of a name, I can specify `event_names='target'`. If `add_events` is a list, then `event_names` must be a list of equal length if custom names are desired. By default we'll take the names from `add_events`
    RTs: bool, optional
        If we have a design that required some response to a stimulus, we can request the reaction times. Default = False
    RT_relative_to: str, optional
        If `RTs=True`, we need to know relative to what time the button response should be offset. Only correct responses are considered, as there's a conditional statement that requires the present of the reference time (e.g., `target_onset`) and button response. If there's a response but no reference time, the reaction time cannot be calculated. If you do not have a separate reference time column, you can specify `RT_relative_to='start'` to calculate the reaction time relative to onset time. If `RT_relative_to != 'start'`, I'll assume you had a target in your experiment in X/n_trials. From this, we can calculate the accuracy and save that to `self.df_accuracy`, while reaction times are saved in`self.df_rts`

    Examples
    ----------
    >>> from linescanning.utils import ParseExpToolsFile
    >>> file = 'some/path/to/exptoolsfile.tsv'
    >>> parsed_file = ParseExpToolsFile(file, subject=1, run=1, button=True)
    >>> onsets = parsed_file.get_onset_df()

    >>> # If you want to get all your subjects and runs in 1 nideconv compatible dataframe, you can do something like this:
    >>> onsets = []
    >>> run_subjects = ['001','002','003']
    >>> for sub in run_subjects:
    >>>     path_tsv_files = os.path.join(f'some/path/sub-{sub}')
    >>>     f = os.listdir(path_tsv_files)
    >>>     nr_runs = []; [nr_runs.append(os.path.join(path_tsv_files, r)) for r in f if "events.tsv" in r]
    >>> 
    >>>     for run in range(1,len(nr_runs)+1):
    >>>         sub_idx = run_subjects.index(sub)+1
    >>>         onsets.append(ParseExpToolsFile(df_onsets, subject=sub_idx, run=run).get_onset_df())
    >>>         
    >>> onsets = pd.concat(onsets).set_index(['subject', 'run', 'event_type'])
    """

    def __init__(
        self, 
        tsv_file, 
        subject=1, 
        run=1, 
        button=False, 
        blinks=None, 
        RTs=False,
        RT_relative_to=None,
        TR=0.105, 
        deleted_first_timepoints=0, 
        edfs=None, 
        funcs=None, 
        use_bids=True,
        verbose=False,
        phase_onset=1,
        stim_duration=None,
        add_events=None,
        event_names=None,
        **kwargs):

        self.tsv_file                       = tsv_file
        self.sub                            = subject
        self.run                            = run
        self.TR                             = TR
        self.deleted_first_timepoints       = deleted_first_timepoints
        self.button                         = button
        self.blinks                         = blinks
        self.funcs                          = funcs
        self.edfs                           = edfs
        self.use_bids                       = use_bids
        self.verbose                        = verbose
        self.phase_onset                    = phase_onset
        self.stim_duration                  = stim_duration
        self.RTs                            = RTs
        self.RT_relative_to                 = RT_relative_to
        self.add_events                     = add_events
        self.event_names                    = event_names
        self.__dict__.update(kwargs)

        if self.edfs != None:
            super().__init__(
                self.edfs, 
                subject=self.sub, 
                func_file=self.funcs, 
                TR1=self.TR, 
                use_bids=self.use_bids, 
                verbose=self.verbose)
        else:
            self.include_blinks = False

        if self.verbose:
            print("\nEXPTOOLS")

        if isinstance(self.tsv_file, str):
            self.tsv_file = [self.tsv_file]

        if isinstance(self.tsv_file, list):
            df_onsets = []
            df_rts = []
            df_accuracy = []
            for run, onset_file in enumerate(self.tsv_file):

                if self.use_bids:
                    bids_comps = utils.split_bids_components(onset_file)
                    for el in ['sub', 'run']:
                        setattr(self, el, bids_comps[el])
                else:
                    self.run = run+1

                # include eyeblinks?
                if self.include_blinks:
                    self.blinks = self.fetch_blinks_run(run=self.run)

                # check if we got different nr of vols to delete per run
                delete_vols = check_input_is_list(self, "deleted_first_timepoints", list_element=run)

                # check if we got different stimulus durations per run
                duration = check_input_is_list(self, var="stim_duration", list_element=run)

                # read in the exptools-file
                self.preprocess_exptools_file(
                    onset_file, 
                    run=self.run, 
                    delete_vols=delete_vols, 
                    phase_onset=self.phase_onset,
                    duration=duration)

                # append to df
                df_onsets.append(self.get_onset_df(index=False))

                # check if we got RTs
                try:
                    df_rts.append(self.get_rts_df(index=False))
                except:
                    pass

                # check if we got accuracy (only if RT_relative_to != 'start')
                try:
                    df_accuracy.append(self.get_accuracy(index=False))
                except:
                    pass


            # concatemate df
            self.df_onsets = pd.concat(df_onsets).set_index(['subject', 'run', 'event_type'])

            # rts
            try:
                self.df_rts = pd.concat(df_rts).set_index(['subject', 'run'])
            except:
                pass

            # accuracy
            try:
                self.df_accuracy = pd.concat(df_accuracy).set_index(['subject', 'run'])
            except:
                pass

        # get events per run
        self.events_per_run = self.events_per_run()


    def events_per_run(self):
        n_runs = np.unique(self.df_onsets.reset_index()['run'].values)
        events = {}
        for run in n_runs:
            df = utils.select_from_df(self.df_onsets, expression=f"run = {run}", index=None)
            events[run] = np.unique(df['event_type'].values)

        return events

    def events_single_run(self, run=1):
        return self.events_per_run[run]

    def preprocess_exptools_file(self, tsv_file, run=1, delete_vols=0, phase_onset=1, duration=None):
        
        if self.verbose:
            print(f"Preprocessing {tsv_file}")
        with open(tsv_file) as f:
            self.data = pd.read_csv(f, delimiter='\t')

        # trim onsets to first 't'
        delete_time         = delete_vols*self.TR
        self.start_time     = float(self.data.loc[(self.data['event_type'] == "pulse") & (self.data['phase'] == 0)]['onset'].values[0])
        self.trimmed        = self.data.loc[(self.data['event_type'] == "stim") & (self.data['phase'] == phase_onset)].iloc[1:,:]
        self.onset_times    = self.trimmed['onset'].values[...,np.newaxis]

        skip_duration = False
        if isinstance(duration, float) or isinstance(duration, int):
            self.durations = np.full_like(self.onset_times, float(duration))
        elif duration == None:
            skip_duration = True
        else:
            self.durations = self.trimmed['duration'].values[...,np.newaxis]
        
        self.condition = self.trimmed['condition'].values[..., np.newaxis]
        if self.verbose:
            print(f" 1st 't' @{round(self.start_time,2)}s")
        
        # add button presses
        if self.button:

            # get dataframe with responses
            self.response_df = self.data.loc[(self.data['event_type'] == "response") & (self.data['response'] != 'space')]

            # get the onset times
            self.response_times = self.response_df['onset'].values[...,np.newaxis]

            # stack onset times
            self.onset_times = np.vstack([self.onset_times, self.response_times])

            # make a condition column
            self.response_condition = self.response_df['response'].values[...,np.newaxis]

            # stack it onto existing condition array
            self.condition = np.vstack([self.condition, self.response_condition])
        
        # check if we should include other events
        if isinstance(self.add_events, str):
            self.add_events = [self.add_events]

        if isinstance(self.event_names, str):
            self.event_names = [self.event_names]            

        if isinstance(self.add_events, list):
            if isinstance(self.event_names, list):
                if len(self.event_names) != len(self.add_events):
                    raise ValueError(f"Length ({len(self.add_events)}) of added events {self.add_events} does not equal the length ({len(self.event_names)}) of requested event names {self.event_names}")
            else:
                self.event_names = self.add_events.copy()

            for ix,ev in enumerate(self.add_events):
                ev_times = np.array([ii for ii in np.unique(self.data[ev].values)])

                # filter for nan (https://stackoverflow.com/a/11620982)
                ev_times = ev_times[~np.isnan(ev_times)][...,np.newaxis]

                # create condition
                ev_names = np.full(ev_times.shape, self.event_names[ix])

                # add times and names to array
                self.onset_times = np.vstack((self.onset_times, ev_times))
                self.condition = np.vstack((self.condition, ev_names))

        # check if we should add duration (can't be used in combination with add_events)
        if not skip_duration:
            if isinstance(self.add_events, list):
                raise TypeError(f"Cannot do this operation because I don't know the durations for the added events. Please consider using 'stim_duration!={self.stim_duration}' or 'add_events=None'")
            self.onset = np.hstack((self.onset_times, self.condition, self.durations))
        else:
            self.onset = np.hstack((self.onset_times, self.condition))

        # sort array based on onset times (https://stackoverflow.com/a/2828121)        
        self.onset = self.onset[self.onset[:,0].argsort()]

        # add eyeblinks
        if isinstance(self.blinks, np.ndarray) or isinstance(self.blinks, str):

            if self.verbose:
                print(" Including eyeblinks")

            if isinstance(self.blinks, np.ndarray):
                self.eye_blinks = self.blinks
            elif isinstance(self.blinks, str):
                if self.blinks.endwith(".npy"):
                    self.eye_blinks = np.load(self.blinks)
                else:
                    raise ValueError(f"Could not recognize type of {self.blinks}. Should be numpy array or string to numpy file")

            self.eye_blinks = self.eye_blinks.astype('object').flatten()
            tmp = self.onset[:,0].flatten()

            # combine and sort timings
            comb = np.concatenate((self.eye_blinks, tmp))
            comb = np.sort(comb)[...,np.newaxis]

            # add back event types by checking timing values in both arrays
            event_array = []
            for ii in comb:

                if ii in self.onset:
                    idx = np.where(self.onset == ii)[0][0]
                    event_array.append(self.onset[idx][-1])
                else:
                    idx = np.where(self.eye_blinks == ii)[0]
                    event_array.append('blink')

            event_array = np.array(event_array)[...,np.newaxis]

            self.onset = np.concatenate((comb, event_array), axis=1)

        # correct for start time of experiment and deleted time due to removal of inital volumes
        self.onset[:, 0] = self.onset[:, 0] - (self.start_time + delete_time)

        if self.verbose:
            print(f" Cutting {round(self.start_time + delete_time,2)}s from onsets")

            if not skip_duration:
                print(f" Avg duration = {round(self.durations.mean(),2)}s")

        # make dataframe
        if skip_duration:
            columns = ['onset', 'event_type']
        else:
            columns = ['onset', 'event_type', 'duration']
            
        self.onset_df = self.index_onset(self.onset, columns=columns, subject=self.sub, run=run)

        # check if we should do reaction times
        if self.RTs:
            if not isinstance(self.RT_relative_to, str):
                raise ValueError(f"Need a reference column to calculate reaction times (RTs), not '{self.RT_relative_to}'")
            
            # get response times
            if not hasattr(self, "response_df"):
                self.response_df = self.data.loc[(self.data['event_type'] == "response") & (self.data['response'] != 'space')]
            
            # fetch trials were target happened
            self.id_target = {}
            self.id_no_target = []
            self.false_alarms = []
            self.correct_rejection = []
            for idx in self.trimmed['trial_nr'].values:

                # cross-check
                trial_overview = self.data.loc[(self.data['trial_nr'] == idx)]
                
                # get the values of reference column; skip if all elements are nan
                if self.RT_relative_to != "start":
                    ref_values = np.unique(trial_overview[self.RT_relative_to].values)
                    has_nans = np.isnan(ref_values)

                    # should be safe as any response will not have a value in reference column
                    if False in has_nans:
                        
                        # increase number of target to remain agnostic about design
                        id_no_nan = np.where(has_nans == False)[0]
                        if len(id_no_nan) != 0:
                            self.id_target[idx] = ref_values[id_no_nan]
                    else:
                        # append no targets
                        self.id_no_target.append(idx)
                        
                        # false alarm = no target but response
                        if 'response' in list(trial_overview['event_type'].values):
                            self.false_alarms.append(idx)
                        else:
                            # correct rejection = no target and no response
                            self.correct_rejection.append(idx)

            self.rts = []
            self.hits = []
            for target_trial in self.id_target.keys():
        
                # get corresponding reference value
                if self.RT_relative_to != "start":
                    ref_value = self.id_target[target_trial]
                else:
                    ref_value = trial_overview.loc[(trial_overview['event_type'] == 'stim')]['onset'].values
                
                # get response value (length will be 0 if dataframe is empty)
                response_value = self.response_df.loc[(self.response_df['trial_nr'] == target_trial)]['onset'].values

                # check if lengths of reference value and response value are equal
                if len(response_value) == len(ref_value):
                    rt = response_value - ref_value

                    # ignore negative reaction times..
                    if rt > 0:
                        self.rts.append(rt)
                
            if len(self.rts) >= 1:
                self.rts = np.array(self.rts)
            else:
                self.rts = np.array([0])

            if len(self.id_target) != 0:
                self.hits = len(self.rts)/len(self.id_target)
                self.miss = (len(self.id_target)-len(self.rts))/len(self.id_target)
                self.fa = len(self.false_alarms)/len(self.id_no_target)
                self.cr = len(self.correct_rejection)/len(self.id_no_target)

                # set FA to something if 0 to avoid d' = inf
                if self.fa == float(0):
                    self.fa = 0.5*(1/len(self.trimmed['trial_nr'].values))
                    add_fa_txt = f"[added {round(self.fa,2)} to avoid d' = inf]"
                else:
                    add_fa_txt = ""

                # set HITS to something if 0 to avoid d' = inf
                if self.hits == float(0):
                    self.hits = 0.5*(1/len(self.trimmed['trial_nr'].values))
                    add_hits_txt = f"[added {round(self.hits,2)} to avoid d' = inf]"
                else:
                    add_hits_txt = ""

                # calculate d-prime
                self.hitZ = stats.norm.ppf(self.hits)
                self.faZ = stats.norm.ppf(self.fa)
                self.dPrime = self.hitZ-self.faZ

                # d-prime=0 is considered as pure guessing.
                # d-prime=1 is considered as good measure of signal sensitivity/detectability.
                # d-prime=2 is considered as awesome.
                
                self.accuracy_df = self.index_accuracy(np.array([self.hits, self.miss, self.fa, self.cr, self.dPrime], dtype=float)[np.newaxis,...], columns=['hits','miss','fa','cr','d_prime'], subject=self.sub, run=run)

            if self.verbose:
                if hasattr(self, 'dPrime'):
                    print(f" Hits:\t{round(self.hits,2)}\t({len(self.rts)}/{len(self.id_target)})\t{add_hits_txt}")
                    print(f" Miss:\t{round(self.miss,2)}\t({(len(self.id_target)-len(self.rts))}/{len(self.id_target)})")
                    print(f" FA:\t{round(self.fa,2)}\t({len(self.false_alarms)}/{len(self.id_no_target)})\t{add_fa_txt}")
                    print(f" CR:\t{round(self.cr,2)}\t({len(self.correct_rejection)}/{len(self.id_no_target)})")
                    print(f" D':\t{round(self.dPrime,2)}\t(0=guessing;1=good;2=awesome)")
                
                print(f" Average reaction time (RT) = {round(self.rts.mean(),2)}s (relative to '{self.RT_relative_to}').")

            # parse into dataframe
            self.rts_df = self.index_rts(self.rts, columns=["RTs"], subject=self.sub, run=run)


    @staticmethod
    def index_onset(array, columns=None, subject=1, run=1, TR=0.105, set_index=False):
        
        if columns == None:
            df = pd.DataFrame(array)
        else:
            df = pd.DataFrame(array, columns=columns)
            
        df['subject']       = subject
        df['run']           = run
        df['event_type']    = df['event_type'].astype(str)
        df['onset']         = df['onset'].astype(float)

        # check if we got duration
        try:
            df['duration'] = df['duration'].astype(float)  
        except:
            pass

        if set_index:
            return df.set_index(['subject', 'run', 'event_type'])
        else:
            return df        

    @staticmethod
    def index_rts(array, columns=None, subject=1, run=1, set_index=False):
        
        if columns == None:
            df = pd.DataFrame(array)
        else:
            df = pd.DataFrame(array, columns=columns)
            
        df['subject']   = subject
        df['run']       = run
        df['RTs']       = df['RTs'].astype(float)

        if set_index:
            return df.set_index(['subject', 'run'])
        else:
            return df        

    @staticmethod
    def index_accuracy(array, columns=None, subject=1, run=1, set_index=False):
        
        if columns == None:
            df = pd.DataFrame(array)
        else:
            df = pd.DataFrame(array, columns=columns)
            
        df['subject']   = subject
        df['run']       = run

        if set_index:
            return df.set_index(['subject', 'run'])
        else:
            return df                   

    def get_onset_df(self, index=False):
        """Return the indexed DataFrame containing onset times"""

        if index:
            return self.onset_df.set_index(['subject', 'run', 'event_type'])
        else:
            return self.onset_df

    def get_rts_df(self, index=False):
        """Return the indexed DataFrame containing reaction times"""

        if index:
            return self.rts_df.set_index(['subject', 'run'])
        else:
            return self.rts_df

    def get_accuracy(self, index=False):
        """Return the indexed DataFrame containing reaction times"""

        if index:
            return self.accuracy_df.set_index(['subject', 'run'])
        else:
            return self.accuracy_df             

    def onsets_to_fsl(self, fmt='3-column', amplitude=1, output_base=None):
        """onsets_to_fsl

        This function creates a text file with a single column containing the onset times of a given condition. Such a file can be used for SPM or FSL modeling, but it should be noted that the onset times have been corrected for the deleted volumes at the beginning. So make sure your inputting the correct functional data in these cases.

        Parameters
        ----------
        subject: int
            subject number you'd like to have the onset times for
        run: int
            run number you'd like to have the onset times for
        condition: str
            name of the condition you'd like to have the onset times for as specified in the data frame
        fname: str
            path to output name for text file

        Returns
        ----------
        str
            if `fname` was specified, a new file will be created and `fname` will be returned as string pointing to that file

        list
            if `fname` was *None*, the list of onset times will be returned
        """

        onsets = self.df_onsets.copy()
        subj_list = self.get_subjects(onsets)
        for sub in subj_list:
            df = utils.select_from_df(onsets, expression=f"subject = {sub}")

            n_runs = self.get_runs(df)

            for run in n_runs:
                onsets_per_run = utils.select_from_df(df, expression=f"run = {run}")
                events_per_run = self.get_events(onsets_per_run)

                for ix, ev in enumerate(events_per_run):

                    onsets_per_event = utils.select_from_df(onsets_per_run, expression=f"event_type = {events_per_run[ix]}")
                    if output_base == None:
                        if isinstance(self.tsv_file, list):
                            outdir = os.path.dirname(self.tsv_file[0])
                        elif isinstance(self.tsv_file, str):
                            outdir = os.path.dirname(self.tsv_file)
                        else:
                            outdir = os.getcwd()

                        fname = opj(outdir, f"{ev}_run-{run}.txt")
                    else:
                        fname = f"{output_base}{ix+1}_run-{run}.txt"

                    # fetch the onsets
                    event_onsets = onsets_per_event['onset'].values[..., np.newaxis]
                    if fmt == "3-column":

                        # check if we got duration
                        if 'duration' in list(onsets_per_event.columns):
                            duration_arr = onsets_per_event['duration'].values[..., np.newaxis]
                        else:
                            duration_arr = np.ones_like(onsets_per_event)

                        amplitude_arr = np.full_like(event_onsets, amplitude)
                        three_col = np.hstack((event_onsets, duration_arr, amplitude_arr))

                        print(f"Writing {fname}; {three_col.shape}")
                        np.savetxt(fname, three_col, delimiter='\t', fmt='%1.3f')
                    else:
                        np.savetxt(fname, event_onsets, delimiter='\t', fmt='%1.3f')

    @staticmethod
    def get_subjects(df):
        try:
            df = df.reset_index()
        except:
            pass

        return np.unique(df['subject'].values)

    @staticmethod
    def get_runs(df):
        try:
            df = df.reset_index()
        except:
            pass

        return np.unique(df['run'].values)

    @staticmethod
    def get_events(df):
        try:
            df = df.reset_index()
        except:
            pass

        return np.unique(df['event_type'].values)

Usage of pipeline requires bash>4.0

Some pieces of code will fail if the bash version is below 4.0. Could be made more explicit by specifying bash in front of the spinoza_* modules in the master script.

Could also add a check for bash version in spinoza_setup during installation. Update docs too.

opened by gjheij 9

Minor changes to be compatible with 3x MP2RAGE & T2

master:

if [[ $? -eq 0 ]]; then

  if [[ "${@}" == *"-q"* ]]; then
    ${job} spinoza_sinusfrommni
    exit 1
  fi

  echo "---------------------------------------------------------------------------------------------------"
  echo "[${module}] Create sagittal sinus mask using the MNI-mask & T1w/T2w-ratio"

  if [[ -d ${DIR_DATA_DERIV}/pymp2rage ]]; then
    INPUT_DIR=${DIR_DATA_DERIV}/pymp2rage
  else
    INPUT_DIR=${DIR_DATA_HOME}
  fi

  ${job} spinoza_sinusfrommni ${SUBJECT} ${SESSION} ${OW} ${INPUT_DIR} ${DIR_DATA_DERIV}/ants ${DIR_DATA_DERIV}/manual_masks

  if [[ $? -ne 0 ]]; then
    echo "ERROR in `basename ${0}`: spinoza_sinusfrommni exited with non-zero status"
    exit 1
  fi

  echo "[${module}] DONE" && echo

fi

spinoza_averageanatomies:

  t2_newspace=`find ${input_dir} -type f \( -name "*acq-3DTSE*" -and -name "*space*" \) 2>/dev/null`
  if [ ! -z ${t2_newspace} ]; then
    cp ${t2_newspace} ${output_dir}/${base}_acq-${DATA}_T2.nii.gz 2>/dev/null
  fi

spinoza_brainextraction:

            echo "CAT12 was selected, using following parameters:"
            echo " -input  = `basename ${input}`"
            echo " -output = `basename ${brainmask}`"
            
            call_cat12 ${mode} -s ${SPM_PATH} ${input} ${BET}

spinoza_registration:

    moving=`find "${input_dir}" -type f \( -iname "*${ACQ[1]^^}_*" -and -name "*.nii.gz" -and -not -name "*space-*" \) 2>/dev/null`

spinoza_sinusfrommni:

            call_t12ratio --t1 ${t1} --t2 ${t2} -o ${pn_anat}/${base}_acq-${DATA^^}

opened by carlottasabate 2

different affines for spm and fsl masks - module 04

Hi,

When running module 4 on the MP2RAGE of one of my participants I get the following error

Combining SPM & FSL mask to include cerebellum too
Traceback (most recent call last):
  File "/Users/verissimo/linescanning/bin/call_mp2rage", line 274, in <module>
    main(sys.argv[1:])
  File "/Users/verissimo/linescanning/bin/call_mp2rage", line 258, in main
    new_mask = image.math_img('(spm + fsl) > 0', spm=brainmask, fsl=opj(inputdir, 'tmp_mask.nii.gz'))
  File "/Users/verissimo/.local/lib/python3.7/site-packages/nilearn/image/image.py", line 907, in math_img
    _check_same_fov(*niimgs, raise_error=True)
  File "/Users/verissimo/.local/lib/python3.7/site-packages/nilearn/_utils/niimg_conversions.py", line 67, in _check_same_fov
    for e in errors]))
ValueError: ("Input images cannot be compared, you provided 'dict_values(['/Volumes/Verissimo_Seagate_Hub_Plus/Projects/FAM/DATA/anat_preprocessing/derivatives/pymp2rage/sub-011/ses-1/sub-011_ses-1_acq-MP2RAGE_desc-spm_mask.nii.gz', '/Volumes/Verissimo_Seagate_Hub_Plus/Projects/FAM/DATA/anat_preprocessing/sub-011/ses-1/tmp_mask.nii.gz'])',", 'Following field of view errors were detected:\n- img_#0 and img_#1 do not have the same affine')

When checking, indeed their affine is slighty different (probably due to some rounding error) img_1.affine

array([[ 6.99586272e-01, -2.39259005e-02,  0.00000000e+00,
        -8.79358139e+01],
       [ 2.40633488e-02,  6.95588470e-01,  4.85897064e-04,
        -1.21713348e+02],
       [-1.68085098e-05, -4.85658646e-04,  6.95999801e-01,
        -1.34004532e+02],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00]])

img_2affine

array([[ 6.99586272e-01, -2.39258446e-02,  0.00000000e+00,
        -8.79358139e+01],
       [ 2.40633450e-02,  6.95588470e-01,  4.85899596e-04,
        -1.21713348e+02],
       [-1.67993858e-05, -4.85612400e-04,  6.95999801e-01,
        -1.34004532e+02],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00]])

I didn't have this before, so am wondering why this only happened now... And which of the affines is the one that should be used throughout the pipeline?

opened by iverissimo 2

input file name in call_stats2surf

"nii.gz" is by default part of the find-command, so if you input a file with nii.gz at the end it will look for a file with nii.gz.nii.gz

could be fixed via sed 's,.nii.gz.nii.gz,.nii.gz,' or if loop

opened by hlesaint 2
One class to rule them all; let ParseFuncFile inherit from ParseExpToolsFile

Now I have separate classes for formatting the functional data (ParseFuncFile), experimental data (ParseExpToolsFile), eyetracker data (ParseEyetrackerFile), and physio files (ParsePhysioFile). Might be good to have ParseFuncFile as base class, and have it inherit the other classes.

Right now, ParseExpToolsFile already inherits ParseEyetrackerFile, so we'd only have to include the former into ParseFuncFile. I envision the inputs to be solely list of files corresponding to fMRI, exptools, eyetracker, and physio (maybe even just BIDS-like directory).

And maybe split these classes into a separate dataset.py-module rather than clogging up utils.py

opened by gjheij 2

melodic axis of power spectra

can you change the limit of the x-axis in the power spectra here: /data1/projects/MicroFunc/Luisa/programs/linescanning/linescanning/preproc.py to

            # get frequency/power
            freq = get_freq(tc, TR=self.TR, spectrum_type="fft")

            plotting.LazyPlot(
                freq[1],
                xx=freq[0],
                color=color,
                x_label="frequency (Hz)",
                y_label="power (a.u.)",
                title="power spectra",
                axs=ax_freq,
                line_width=2,
                x_lim=[0,1/(2*self.TR)],
                **kwargs)

So that we actually have the maximum frequency available Thanks :)

opened by luisarai 1

missing warp-files for call_antsapplytransforms

When running spinoza_sinusfrommni got the missing file error even though the files are in the the ants folder. Noticed that lines 172 and 173 expected MNI152Nlin6Asym and files created in module 5 are called MNI152NLin6Asym (so capital letter). I guess it's related to the version of ANTs used, but have not checked to be sure

opened by iverissimo 1
Check if bias correction of T2 improves pial recon

I just noticed that the T2's tend to have dropout right in the area we want it to properly work (i.e., visual areas). I'll check if bias correction prior to FreeSurfer make recon even better

opened by gjheij 1
add image orientation check
Realized that MP2RAGE I had was in PSR orientation (336x336x265) and code expects image to be RAS (265x336x336). This is only caught in module 13 in call_gdhmasking - when calculating 'outside' mask (line 248) from dilated dura mask and sinus drawn in ITKsnap, they will have different affines... Because I noticed this, I reoriented the two inversions niftis with a quick script

orig_img = nb.load(input_file) canonical_img = nb.as_closest_canonical(orig_img) nb.save(canonical_img, input_file.replace('.nii.gz','RAS_.nii.gz'))

and re-ran the pipeline. Then all was fine. Would be better to have some orientation check in the first modules (maybe in spinoza_qmrimaps?) to avoid this. Can be done by re-orienting the images in the pipeline, or throwing an error so the user knows the orientation should be RAS.
opened by iverissimo 1
matlab alias fix in call_bashhelper

When calling matlab, current code assumes that there is an alias command matlab. Because I switched from zshell to bash, alias are not inherited and calling matlab would return command not found error message. My fix for this was to replace it with $MATLAB_DIR: $MATLAB_DIR -nosplash -nodisplay -batch "addpath('$(dirname ${1})'); $(basename ${1} .m)" | tail -n +${skip_lines} # suppress intro text This worked, not sure if best fix

opened by iverissimo 1

custom stimulus duration input

opened by luisarai 0

Add `shift` parameter to call_createimg

Sometimes we need to move the slices up/down depending on how the saturation slabs and B1 behave. This messes the beam-image up, so there needs to be an argument that says how much shift there was so we can correctly estimate the position of the line

opened by gjheij 0

Linescanning - Package for (pre)processing of anatomical and (linescanning) fMRI data

Related tags

Overview

line scanning repository

In active development - do not use unless otherwise instructed by repo owners

Policy & To Do

overview of the pipeline

how to set up

How to plan the line

Comments

Owner

Jurjen Heij

The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

Pre-Trained Image Processing Transformer (IPT)

Data Preparation, Processing, and Visualization for MoVi Data

Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

The official PyTorch implementation of recent paper - SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

TorchGeo is a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.

Data manipulation and transformation for audio signal processing, powered by PyTorch

Source codes of CenterTrack++ in 2021 ICME Workshop on Big Surveillance Data Processing and Analysis

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Apache Spark - A unified analytics engine for large-scale data processing

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Automatically download the cwru data set, and then divide it into training data set and test data set

QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place — so you can focus on building the next big thing.

Simple Python project using Opencv and datetime package to recognise faces and log attendance data in a csv file.

Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".