Neuroimaging Data Access
All data is collected on rolando.cns.dartmouth.edu, where it is made available for access and for conversion into BIDS.
You need to request access permissions from Andrew Connolly.
All DICOMs are organized into a
YEAR/MONTH/DATE/ACCESSION hierarchy under
rsync them to your local storage.
At the moment, data is converted from DICOMs into BIDS upon request from a lab member to Yaroslav Halchenko.
The converted data is placed within the directory hierarchy under
convention described in ReproIn section (TODO: make into a reference).
These directories are also DataLad datasets, so you have two options on how to transfer them:
Make sure git is configured so that it knows who you are.
If running the
git config user.name command prints a value, then you are most likely all set.
If it comes out empty, run
git config --global user.name "First Last"
git config --global user.email "someemail@address"
replacing those values with your name and email.
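The check-then-set logic above can be sketched as a small shell function. This is only an illustration: the name ensure_git_identity is hypothetical, and it assumes you want to leave an already configured identity untouched.

```shell
# ensure_git_identity NAME EMAIL
# Hypothetical helper: set the global git identity only where it is missing,
# so an existing user.name / user.email is never overwritten.
ensure_git_identity() {
    name="$1"
    email="$2"
    if [ -z "$(git config --global user.name)" ]; then
        git config --global user.name "$name"
    fi
    if [ -z "$(git config --global user.email)" ]; then
        git config --global user.email "$email"
    fi
}
```

It would be called as ensure_git_identity "First Last" "someemail@address", with your own name and email substituted.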
If you are transferring data to the Discovery HPC, please first check out the section below (Discovery filesystem) on how to configure your git for Discovery's ACL filesystem, unless you know that your target filesystem is pure POSIX.
Then it is recommended to first create a directory for your study, e.g.
mkdir ID_name where
ID_name is the same as on rolando, e.g.
cd into it, e.g.:
mkdir 0001_dbic-animals
cd 0001_dbic-animals
Use the datalad install command
to obtain the dataset, and then datalad get to
obtain specific files. If you are greedy, add
-r to get the full hierarchy of datasets, and/or
to immediately also fetch all data files
datalad install -s firstname.lastname@example.org:/inbox/BIDS/dbic/0001_dbic-animals dbic
cd dbic
datalad get -J4 sub-*  # to get only converted data, without tarballs etc
Later updates to fetch new data (new subjects, etc.) can be done via
datalad update --how merge -r
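As a rough sketch, the install, get, and update steps can be wrapped so that the first call installs the dataset and later calls update it. The function names run and fetch_study and the DRY_RUN switch are hypothetical conveniences, not part of DataLad; the datalad invocations themselves are the ones shown above.

```shell
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# fetch_study SOURCE TARGET
# Hypothetical wrapper: install the dataset on the first call, merge updates
# on later calls, then fetch the converted sub-* data in either case.
# The body runs in a subshell, so the caller's working directory is unchanged.
fetch_study() (
    source_url="$1"
    target="$2"
    if [ -d "$target" ]; then
        run cd "$target" || exit 1
        run datalad update --how merge -r || exit 1
    else
        run datalad install -s "$source_url" "$target" || exit 1
        run cd "$target" || exit 1
    fi
    run datalad get -J4 sub-*
)
```

Running it first with DRY_RUN=1 prints the commands without contacting rolando, which is a cheap way to check the source and target arguments.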
Discovery filesystem
Unfortunately, the filesystem used on Discovery by default does not support smooth git-annex, and thus DataLad, operation.
If you use
datalad install or
datalad clone as instructed above, you would likely end up in an "adjusted" git-annex branch, which would complicate your interactions with the data.
We recommend using a new git-annex feature that allows for custom protection of data on Discovery.
Step 1: make sure you are using a recent git-annex
Make sure that you are using a recent (at least as of January 2023) version of git-annex.
For that you could use the version we provide and just add the following to your
~/.bashrc:
ANNEX_BIN_PATH=/dartfs/rc/lab/D/DBIC/DBIC/archive/git-annex/usr/lib/git-annex.linux/
echo "$PATH" | grep -q "$ANNEX_BIN_PATH" || export PATH="$ANNEX_BIN_PATH:$PATH"
Then, whenever you log in again (or open a new
bash) and type
git annex version, you should get a version newer than the above date.
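The version check can also be scripted. The sketch below assumes that git-annex version strings embed the release date as eight digits after the major version (for example 10.20230126), and that 20230101 is an acceptable stand-in for "January 2023". Both function names are hypothetical.

```shell
# annex_is_recent VERSION [MIN_DATE]
# Succeed if the date embedded in a git-annex version string is MIN_DATE
# (default 20230101, an assumed cutoff for "January 2023") or later.
annex_is_recent() {
    version="$1"
    min_date="${2:-20230101}"
    # Drop the leading major version ("10.") and keep the 8-digit date.
    date_part=$(printf '%s\n' "$version" |
        sed -n 's/^[0-9]*\.\([0-9]\{8\}\).*$/\1/p')
    [ -n "$date_part" ] && [ "$date_part" -ge "$min_date" ]
}

# installed_annex_version: version string of the git-annex found on PATH,
# taken from the "git-annex version:" line of `git annex version`.
installed_annex_version() {
    git annex version | sed -n 's/^git-annex version: //p'
}
```

A login script could then warn with something like: annex_is_recent "$(installed_annex_version)" || echo "git-annex is too old".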
Step 2: configure git-annex to use custom data protection
Adjust your global
~/.gitconfig with the following section
[annex]
    thawcontent-command = /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content %path
    freezecontent-command = /dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/freeze-content %path
which can also be done by running the commands
git config --global annex.thawcontent-command '/dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/thaw-content %path'
git config --global annex.freezecontent-command '/dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/freeze-content %path'
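To confirm that the configuration took effect, something like the following could be run. It is a minimal sketch: check_annex_protection is a hypothetical name, and it assumes the configured value starts with the script path followed by a space, as in the settings above.

```shell
# check_annex_protection
# Hypothetical check: report whether both git-annex content commands are set
# in the global git config and point at executable scripts.
check_annex_protection() {
    rc=0
    for key in annex.thawcontent-command annex.freezecontent-command; do
        value=$(git config --global --get "$key") || {
            echo "$key: not set"
            rc=1
            continue
        }
        # The script path is everything before the first space ("... %path").
        script=${value%% *}
        if [ -x "$script" ]; then
            echo "$key: ok"
        else
            echo "$key: $script is not executable"
            rc=1
        fi
    done
    return $rc
}
```

The function returns non-zero if either setting is missing or points at a script you cannot execute, so it can be used in a conditional before cloning.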
Step 3: make sure that the directory has the group ACL to remove children
It is the
D ACE permission: if the folder lacks it, then
git-annex will be unable to move a read-only file under
So, if you get a "Permission error" while trying to
git annex add or
datalad save, you might need to add that to the group permissions.
Run the
/dartfs/rc/lab/D/DBIC/DBIC/archive/bin-annex/fix-dir-group-perm script with the folder under which you want to create/clone the repo to add that permission.
Now, after these 3 steps, whenever you
datalad install data from rolando you should end up in
If that doesn't happen, file an issue.
Parallel get - multiple passwords
If you are
getting data onto Discovery, onto a non-POSIX-compliant filesystem, then you must provide
datalad get to prevent parallel downloads and multiple password prompts.
Reckless clone still wants to access rolando
TODO: Yarik to figure out
(conda-20200210-datalad) [d31548v@discovery7 tmp]$ datalad install -s 1021_actions --reckless auto 1021_actions_reckless
[INFO ] Fetching updates for <Dataset path=/dartfs/rc/lab/D/DBIC/DBIC/tmp/1021_actions_reckless>
Dartmouth College, Department of Psychological and Brain Sciences
Authorized access only
email@example.com's password:
Old-fashioned way
rsync. But you would need to take care of dereferencing symlinks.
rsync --exclude=.git --copy-links -r \
  rolando.cns.dartmouth.edu:/inbox/BIDS/dbic/dbic-animals dbic-animals
You could add
--exclude=derivatives to exclude
folders with original DICOMs and possible derivatives (fmriqc, etc.).
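Since the copy depends on symlinks being dereferenced, it can be worth checking afterwards that none were left behind. The helper below is a hypothetical sketch that only counts symbolic links under a directory; it says nothing about whether the copied files themselves are complete.

```shell
# remaining_symlinks DIR
# Hypothetical helper: print how many symbolic links exist under DIR.
# After an rsync with --copy-links the expected count is 0.
remaining_symlinks() {
    find "$1" -type l | wc -l | tr -d ' '
}
```

For the example above, remaining_symlinks dbic-animals should print 0; any other number suggests that some links were copied as links rather than as file content.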