P-arch - Digital Repository

Scripting for large-scale sequencing based on Hadoop

dc.contributor.author Schumacher, André
dc.contributor.author Pireddu, Luca
dc.contributor.author Kallio, Aleksi
dc.contributor.author Niemenmaa, Matti
dc.contributor.author Korpelainen, Eija
dc.contributor.author Zanetti, Gianluigi
dc.contributor.author Heljanko, Keijo
dc.date.accessioned 2014-05-16T08:03:04Z
dc.date.available 2014-05-16T08:03:04Z
dc.date.issued 2013
dc.identifier.issn 2226-6089
dc.identifier.uri http://hdl.handle.net/11050/909
dc.description.abstract The large volumes of data generated by modern sequencing experiments present significant challenges in their manipulation and analysis. Traditional approaches are often difficult to scale. We describe our ongoing work on SeqPig, a tool that facilitates the use of the Pig Latin distributed scripting language to manipulate, analyze and query sequencing data, applying the advances motivated by the “big data revolution” in data-intensive activities. SeqPig provides access to popular data formats and implements a number of custom sequencing-specific functions. Most importantly, it grants users access to the scalable Hadoop platform from a high-level scripting language. IT
dc.language.iso en IT
dc.relation.ispartof EMBnet.journal. The Next NGS Challenge Conference: Data Processing and Integration 14-16 May 2013, Valencia, Spain IT
dc.relation.ispartofseries 19;Suppl. A
dc.rights Attribution - NonCommercial - ShareAlike 3.0 Italy *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/it/ *
dc.subject bioinformatics IT
dc.subject ngs IT
dc.subject data analysis IT
dc.subject cloud computing IT
dc.subject high-performance computing IT
dc.title Scripting for large-scale sequencing based on Hadoop IT
dc.type Article IT
dc.description.pagenumber 84-85 IT
dc.description.status Published IT
dc.identifier.doi 10.14806/ej.19.A.628 IT
dc.subject.een-cordis EEN CORDIS::BIOLOGICAL SCIENCES ::Genome research ::Bioinformatics IT
dc.subject.program Program::Biomedicine::Bioinformatics (BI) IT

Attribution - NonCommercial - ShareAlike 3.0 Italy