How to...
-
How to get the path for the folder containing a bash script within the bash script:
# Get the directory where the script resides
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
echo "$SCRIPT_DIR"
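The `$0` form only works when the script is executed; when a file is sourced, `$0` is the calling shell. A minimal sketch of a variant that covers both cases in bash, assuming `BASH_SOURCE` is available (bash only). The helper name `script_dir` is made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: resolve the directory of the file that defines this function.
# script_dir is a hypothetical helper name, not a bash builtin.
script_dir() {
    # Inside a function, BASH_SOURCE[0] is the file where the function was
    # defined, whether that file was executed or sourced.
    local src="${BASH_SOURCE[0]}"
    # Clear CDPATH so cd prints nothing; pwd -P resolves symlinked directories.
    (CDPATH= cd -- "$(dirname -- "$src")" && pwd -P)
}

SCRIPT_DIR="$(script_dir)"
echo "$SCRIPT_DIR"
```

The subshell keeps the `cd` from changing the caller's working directory. This does not resolve a symlink to the script file itself, only symlinks in the directory path.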
-
How to convert a minimap2 alignment to gff/gtf:
https://github.com/lh3/minimap2/issues/455
https://github.com/lh3/minimap2/files/9591008/bam2gff_fixGffread.zip
minimap2 -t 10 -ax splice:hq -uf ref.fa cdna.fa | /Bio/bin/samtools-1.14 view -b > minimap2.tr.bam
perl bam2gff.pl -b minimap2.tr.bam -o minimap2.tr.gff -s /Bio/bin/samtools-1.14
gffread minimap2.tr.gff -T -o minimap2.tr.gtf
perl fixGffread.pl -i minimap2.tr.gtf -o minimap2.tr.fix.gtf
Alternative method:
# Align sequences and convert to BAM
minimap2 -ax splice --cs target.fa query.fa | samtools sort -O BAM - > alignments.bam
# Convert to BED12 using BEDtools
bedtools bamtobed -bed12 -i alignments.bam > alignments.bed
# Convert to genePred using UCSC tools
bedToGenePred alignments.bed alignments.genepred
# Convert to GTF2 using UCSC tools. genePredToGtf has additional options that might be useful in specific use cases.
genePredToGtf "file" alignments.genepred alignments.gtf
-
How to do a dotplot with minimap2:
minimap2 -DP ref.fa query.fa|miniasm/minidot - > dot.eps
-
How to interpret genome dot plots (#dotplots #genomic).
-
How to clone a public GitHub repository with VS Code and push it to a private GitHub repository.
- Make sure Git is installed
- Open VS Code and use the source control icon on the far left to clone a git repository to a local folder
- Open a terminal in VS Code (View > Terminal)
PS C:\Users\github> cd sarek
PS C:\Users\github\sarek> git remote remove origin
PS C:\Users\github\sarek> git remote add origin https://github.com/ink-blot/sarek.git
PS C:\Users\github\sarek> git branch
* master
PS C:\Users\github\sarek> git push -u origin master
If it doesn't promptly start pushing, an authorisation screen should (eventually) appear (it may take a few minutes).
The token method is preferred:
- Go to the token settings in your GitHub account (Settings > Developer settings > Personal access tokens).
- Click Generate new token (classic).
- Select the scopes you need (e.g., repo for private repositories).
- Generate the token and copy it (you won’t be able to see it again later).
- In the GitHub sign-in window, switch to the Token tab.
- Paste the generated token into the input field and confirm.
Optional: if you want to fetch updates from the original nf-core/sarek repository in the future, add it as an upstream remote:
PS C:\Users\github\sarek> git remote add upstream https://github.com/nf-core/sarek.git
PS C:\Users\github\sarek> git fetch upstream
PS C:\Users\github\sarek> git merge upstream/main
-
How to get rid of "WARNING : No mitochondrion chromosome found" in SnpEff:
Prefix the contig name with MT.
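One way to apply the rename is to rewrite the matching FASTA header before building the SnpEff database. A rough sketch, assuming the prefix is joined directly to the name (`scaffold_42` becomes `MTscaffold_42`). The helper name `prefix_mt_fasta` and the contig name are hypothetical, and the same rename would also have to be made in the annotation file (GFF/GTF) so the sequence IDs still match:

```shell
#!/usr/bin/env bash
# Sketch: prefix one contig name with "MT" in a FASTA stream (stdin -> stdout).
# prefix_mt_fasta is a hypothetical helper, not part of SnpEff.
prefix_mt_fasta() {
    local contig="$1"
    awk -v c="$contig" '
        /^>/ {
            # The sequence ID is the first word of the header, minus ">".
            id = substr($1, 2)
            # Only rewrite the header whose ID matches exactly.
            if (id == c) sub(/^>/, ">MT")
        }
        { print }
    '
}
```

Usage, with hypothetical file names: `prefix_mt_fasta scaffold_42 < genome.fa > genome.mt.fa`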
-
How to collect all files from one channel and associate/combine them with elements of another channel in NextFlow:
Example Input channels:
bam_for_collect_ch2:
[ file('73-50_L002.bam') ]
[ file('73-50_L001.bam') ]
interval_vcfs_3:
[ file('73-50_L001_raw_variants_1.vcf.gz') ]
[ file('73-50_L001_raw_variants_2.vcf.gz') ]
[ file('73-50_L002_raw_variants_1.vcf.gz') ]
[ file('73-50_L002_raw_variants_2.vcf.gz') ]
Process code:
input:
set val(pair_id), val(all_vcf) from bam_for_collect_ch2.map({ file -> file.baseName }).combine(interval_vcfs_3.collect().map({ file -> file.baseName }).toList())
Example output:
[ '73-50_L002', ['73-50_L002_raw_variants_1.vcf.gz', '73-50_L002_raw_variants_2.vcf.gz'] ]
[ '73-50_L001', ['73-50_L001_raw_variants_1.vcf.gz', '73-50_L001_raw_variants_2.vcf.gz'] ]
-
How to collect all files related by prefix in NextFlow:
Example input channel:
interval_bams_ch:
[ file('73-50_L001_raw_variants_1.bam') ]
[ file('73-50_L001_raw_variants_2.bam') ]
[ file('73-50_L002_raw_variants_1.bam') ]
[ file('73-50_L002_raw_variants_2.bam') ]
Process code:
bam_name_parts_ch = interval_bams_ch.map { file ->
    def name = file.baseName.replaceFirst(/_raw_variants_.*/, '')
    tuple(name, file)
}.groupTuple()
Example output:
[ '73-50_L001', [file('73-50_L001_raw_variants_1.bam'), file('73-50_L001_raw_variants_2.bam')] ]
[ '73-50_L002', [file('73-50_L002_raw_variants_1.bam'), file('73-50_L002_raw_variants_2.bam')] ]
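The grouping key can be checked outside NextFlow before running the pipeline. A small shell sketch that mirrors `baseName` followed by `replaceFirst(/_raw_variants_.*/, '')`, assuming file names shaped like the example input. The helper name `group_key` is made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: derive the same grouping key as the NextFlow closure above.
# group_key is a hypothetical helper name.
group_key() {
    local name="${1##*/}"   # drop any leading directories
    name="${name%.*}"       # drop the final extension, like baseName
    # Remove everything from the first "_raw_variants_" onwards.
    printf '%s\n' "${name%%_raw_variants_*}"
}

# Print "key<TAB>file" for each example file, sorted so equal keys sit together.
for f in 73-50_L001_raw_variants_1.bam 73-50_L001_raw_variants_2.bam \
         73-50_L002_raw_variants_1.bam 73-50_L002_raw_variants_2.bam; do
    printf '%s\t%s\n' "$(group_key "$f")" "$f"
done | sort
```

If two files that should be grouped print different keys here, the regex needs adjusting before `groupTuple()` will put them together.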
-
How to conditionally choose from two channels in NextFlow:
grouped_interval_vcf_ch = (params.splitIntervalOverlapLength && params.splitIntervalOverlapLength.toInteger() > 0 ? trimmed_vcf_ch : interval_vcfs_3)
-
How to re-use a file channel an indeterminate number of times (e.g. combining with a channel with an unknown number of elements) in NextFlow DSL1:
process collectGVCF {
    publishDir "${params.combined_1_vcf}", mode: 'copy'

    output:
    set val(pair_id), val(round), file("${pair_id}_raw_variants_${round}.vcf.gz") into collected_vcf
}

// Later after splitting:
sample_files
    .map { it -> it.getBaseName() }
    .combine(Channel.fromPath("${params.combined_1_vcf}/*_raw_variants_1.vcf.gz"))
    .set { reuse_pairs }