(bio)-informatics, data processing and visualization

Friday, July 30, 2010

virtual splicing

A regular expression to remove lowercase characters enclosed by uppercase letters (virtual splicing). The -i.1 switch edits the file in place and keeps a backup as example.txt.1. For example:
perl -p -i.1 -e 's/(?<=[A-Z])[a-z]*(?=[A-Z])//g' example.txt
or
perl -p -i.1 -e 's/(?<=[A-Z])[a-z]*(?=[A-Z])//g unless /^>/' example.txt
(when the file has FASTA header lines, which start with > and are left untouched)

This will transform the string
atgcATGCcgtaACGTtgcaCGTAcgta
to
atgcATGCACGTCGTAcgta
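
For anyone who wants to try the pattern without touching a file, here is a minimal sketch of the same idea in Python rather than Perl. The names INTRON and virtual_splice are made up for this illustration and are not part of the original solution. It uses + instead of * so that empty matches are skipped, which should give the same result here.

```python
import re

# A run of lowercase letters with an uppercase letter immediately
# before it (lookbehind) and immediately after it (lookahead).
# INTRON is a hypothetical name chosen for this sketch.
INTRON = re.compile(r"(?<=[A-Z])[a-z]+(?=[A-Z])")


def virtual_splice(line: str) -> str:
    """Remove lowercase runs flanked by uppercase letters.

    FASTA header lines (starting with '>') are returned unchanged,
    mirroring the 'unless /^>/' guard in the Perl one-liner.
    """
    if line.startswith(">"):
        return line
    return INTRON.sub("", line)


if __name__ == "__main__":
    print(virtual_splice(">example header"))
    print(virtual_splice("atgcATGCcgtaACGTtgcaCGTAcgta"))
```

Like the one-liner, this works one line at a time, so in a line-wrapped FASTA file a lowercase run that touches a line break has no uppercase neighbour on that side and is left in place. Joining the sequence lines first would be one way around that.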

(solution suggested by Leah McHale https://pro.osu.edu/profiles/mchale.21/)