dna-parser

dna-parser is a Python library, implemented in Rust, that encodes DNA/RNA sequences (that is, performs feature extraction on them) for machine learning.

The source code is available on GitHub.

Installation

To install dna-parser, run:

pip install dna-parser

If no prebuilt Python wheel is available for your OS, install Rust and then re-install dna-parser, which should then compile from source on your machine. On a Unix-like OS, run the following command to install Rust:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

or see more options at https://www.rust-lang.org/tools/install.

Quick Start

import dna_parser as dps

sequences = ["agt", "acc"]
encodings = dps.onehot_encoding(sequences)
print(encodings)

# Output:
#[[[0 0 1 0]
#  [0 1 0 0]
#  [0 0 0 1]]

# [[0 0 1 0]
#  [1 0 0 0]
#  [1 0 0 0]]]

All encodings with examples are available in the Documentation section.
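
To make the Quick Start output easier to read, here is a minimal pure-Python sketch of one-hot encoding. It is not the library's implementation. The function name onehot_sketch and the constant ASSUMED_COLUMNS are hypothetical, and the column order (c, g, a, t) is only inferred from the sample output above. Mapping unknown symbols to an all-zero row is a choice made for this sketch, not documented dna-parser behavior.

```python
# Column order inferred from the Quick Start output (an assumption, not a
# documented contract of dna-parser): c -> 0, g -> 1, a -> 2, t -> 3.
ASSUMED_COLUMNS = "cgat"


def onehot_sketch(sequences):
    """Return nested lists with one 4-element row per nucleotide."""
    encoded = []
    for seq in sequences:
        rows = []
        for base in seq.lower():
            row = [0, 0, 0, 0]
            idx = ASSUMED_COLUMNS.find(base)
            # Symbols outside "cgat" keep an all-zero row (sketch-only choice).
            if idx >= 0:
                row[idx] = 1
            rows.append(row)
        encoded.append(rows)
    return encoded


print(onehot_sketch(["agt", "acc"]))
```

The sketch returns plain nested lists rather than an array, so the printed layout differs from the Quick Start output even though the values for the two example sequences are the same.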