📚 User Guidance

Complete guide to using the scHNDB database and downloading data

Overview

Welcome to the scHNDB (Single-cell and Spatial Transcriptomic RNA-seq Database of Head and Neck Squamous Cell Carcinoma) user guide. This guide helps you navigate the database, search for data, and download datasets for your research.

scHNDB provides comprehensive single-cell RNA-seq and spatial transcriptomics data from head and neck cancer research. The database is organized into multiple sections to facilitate easy access to data based on your research needs.

The website is a static portal: pages are pre-built; interactive charts and search tools run in your browser and load supporting JSON or images from the site. You need a modern browser and a stable network connection. Large matrix downloads are hosted externally on Zenodo (open access); follow the Zenodo links on each Resource page.

Database Structure

scHNDB is organized into the following main areas:

1. Search

The Search section provides interactive exploration and visualization. It includes these sub-pages:

  • Dataset Information: Browse sample-level metadata (demographics, staging, tissue, virus status, etc.), filter the table, and view summary charts (static or interactive).
  • Celltype Annotation: Explore cell type annotations and UMAP-style views across integrated data.
  • Malignant Trajectory: Browse pseudotime-related figures and resources for malignant cell trajectories (including cell-lineage subpages where available).
  • Treatment Target: Explore treatment-related targets and supporting views derived from single-cell analysis.
  • Spatial Interaction: Access spatial deconvolution–related views and cell–cell interaction (CCI) resources where provided.

2. Resource

The Resource section is the main entry for downloading processed 10X-style packages and related files. From the Resource landing page you can also open the combined 10X Zenodo package (when applicable). Subsections:

  • Dataset: Per-sample or per-study entries with descriptions and download links to raw/processed packages.
  • Tissue Type: Data aggregated by tissue source (e.g., Tumor, Normal, OPMD, PBMC, Lymph node, Cell line).
  • Cell Type: Data aggregated by major cell populations: Epithelial, Fibroblast, Endothelial, Myeloid, NK, B/Plasma, and T cells—each has its own detail page.
  • Treatment: Treatment-stratified or treatment-associated processed resources, where available.

Typical downloads include a 10X raw package (ZIP) plus separate barcodes, features, matrix, and metadata files where listed. Exact filenames are shown on each page.

3. Guidance

This page—usage orientation, download workflow, and analysis tips.

4. About Us

Project information, team, and contact details.

How to Download Data

Understanding Data Files

Each dataset or category in the Resource section provides four types of files:

  1. Barcode file (barcodes.tsv.gz): Contains unique cell barcode identifiers
  2. Feature file (features.tsv.gz): Contains gene names and annotations
  3. Matrix file (matrix.mtx.gz): Contains the gene expression matrix in sparse MTX format
  4. Metadata file (metadata.csv.gz): Contains cell-level annotations including cell types, sample information, and quality metrics
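
After downloading, a short script can confirm that all four files sit in one directory. This is a minimal sketch: the function name `missing_files` is hypothetical, and the filenames are the generic ones listed above, so adjust them if the Resource page shows different names.

```python
from pathlib import Path

# Generic filenames from the list above; actual names on a Resource page may differ.
EXPECTED_FILES = (
    "barcodes.tsv.gz",
    "features.tsv.gz",
    "matrix.mtx.gz",
    "metadata.csv.gz",
)


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected filenames that are not present in data_dir."""
    folder = Path(data_dir)
    return [name for name in expected if not (folder / name).is_file()]
```

Calling `missing_files("path/to/downloaded/files")` returns an empty list when all four files are present.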

Download by Dataset

1 Navigate to Resource → Dataset

2 Browse the available datasets and read their descriptions

3 Click on a dataset card to view detailed information

4 Download the required files (barcode, feature, matrix, and metadata)

5 Unzip any ZIP package into a local directory. The individual .gz files can stay compressed, since Seurat and Scanpy read them directly

💡 Recommended: Download all four files for each dataset to ensure you have complete information for analysis.
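
The extraction step can also be scripted. The sketch below unpacks a downloaded ZIP package with Python's standard `zipfile` module. The helper name `extract_package` is hypothetical, and the internal layout of the scHNDB packages is not assumed: the function simply reports whatever files the archive contains.

```python
import zipfile
from pathlib import Path


def extract_package(zip_path, out_dir):
    """Unpack a downloaded ZIP into out_dir and return the extracted file paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # Skip directory entries, which end with "/" in ZIP listings.
    return sorted(str(target / name) for name in names if not name.endswith("/"))
```

The returned list makes it easy to see which directory holds the barcode, feature, and matrix files before pointing a loader at it.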

Download by Tissue Type

If you're interested in specific tissue types:

1 Navigate to Resource → Tissue Type

2 Select your tissue type of interest (e.g., Tumor, Normal, PBMC)

3 Each tissue type page shows the number of cells and samples available

4 Download the integrated data files for that tissue type

Download by Cell Type

If you're focusing on specific cell populations:

1 Navigate to Resource → Cell Type

2 Choose your cell type of interest (e.g., T Cell, Myeloid Cell)

3 Review the subtypes included in each category

4 Download the processed data for your cell type of interest. The B/Plasma cell type has a dedicated page alongside T, NK, Myeloid, Epithelial, Endothelial, and Fibroblast.

⚠️ Note: Ensure you have sufficient storage space before downloading. Matrix files can be large (several hundred MB to GB).
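
Free disk space can be checked before starting a large download. This is a minimal sketch using the standard library; the function name `has_free_space` is hypothetical, and the required size is whatever the Zenodo page lists for the file you plan to fetch.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem holding path has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3
```

For example, `has_free_space(".", 5)` checks whether the current directory's disk has at least 5 GB free. Leave extra headroom if you plan to unpack a ZIP package as well.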

Combined 10X package

On the main Resource landing page, use the highlighted link to the combined 10X Zenodo record when you need a single entry point for a full combined matrix and metadata package (see on-page title and description).

Marker Gene Tables

Marker tables (typically .xls) and, for cell types, published reference gene sets (.zip on Zenodo) are not bundled inside the 10X ZIP on the Resource pages. They are published as separate open datasets on Zenodo, linked from the relevant Resource subpage.

💡 Tip: Always use the Zenodo landing page linked from the Resource subpage so you pick the correct file name and version for your tissue or cell type.

Data Analysis

Loading Data in Seurat (R)

Once you've downloaded the data files, you can load them into Seurat for analysis:

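# Load required library
library(Seurat)

# Read 10X format data (directory holding barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz)
data <- Read10X(data.dir = "path/to/downloaded/files")

# Create Seurat object
seurat_obj <- CreateSeuratObject(counts = data, project = "scHNDB")

# Load metadata; row.names = 1 assumes the first column holds the cell barcodes
metadata <- read.csv("path/to/metadata.csv.gz", row.names = 1)

# AddMetaData matches rows to cells by barcode (row name)
seurat_obj <- AddMetaData(seurat_obj, metadata)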

Loading Data in Scanpy (Python)

For Python users, use Scanpy to load the data:

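# Load required libraries
import scanpy as sc
import pandas as pd

# Read 10X format data (directory holding barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz)
adata = sc.read_10x_mtx("path/to/downloaded/files")

# Load metadata; index_col=0 assumes the first column holds the cell barcodes
metadata = pd.read_csv("path/to/metadata.csv.gz", index_col=0)

# Join on barcode so rows stay aligned with the expression matrix
adata.obs = adata.obs.join(metadata)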

Understanding Metadata

The metadata file contains important information about each cell:

  • Cell barcode: Unique identifier for each cell
  • Sample ID: Which sample/patient the cell comes from
  • Cell type: Annotated cell type (e.g., T cell, Epithelial cell)
  • Tissue type: Source tissue (e.g., Tumor, Normal)
  • Quality metrics: nCount_RNA, nFeature_RNA, percent.mt, etc.
  • Additional annotations: May include subtype information, clustering results, etc.
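
To get a first impression of a metadata file without loading the expression matrix, the cells can be tallied by any annotation column. This sketch uses only the standard library. The helper name `count_by_column` is hypothetical, and column names such as `cell_type` are assumptions: check the header of the file you downloaded for the actual names.

```python
import csv
import gzip
from collections import Counter


def count_by_column(metadata_path, column):
    """Count cells per value of one metadata column in a gzipped CSV."""
    with gzip.open(metadata_path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
        return Counter(row[column] for row in reader)
```

A call such as `count_by_column("path/to/metadata.csv.gz", "cell_type")` returns a `Counter` mapping each annotation value to its number of cells. If the column name is wrong, the error message lists the columns that are available.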

Best Practices

Before Downloading

  1. Explore the data first: Use the Search section to understand the available datasets and their characteristics
  2. Check sample sizes: Review the number of cells and samples to ensure they meet your research needs
  3. Read descriptions: Understand what each dataset contains before downloading
  4. Plan storage: Ensure you have adequate disk space for large files

During Analysis

  1. Quality control: Always perform quality control on downloaded data, even though it’s pre-processed
  2. Batch effects: Be aware of potential batch effects when combining multiple datasets
  3. Metadata utilization: Make full use of the provided metadata for your analysis
  4. Normalization: Consider re-normalizing data if combining with your own datasets
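
The quality-control step can start from the metrics already stored in the metadata. The sketch below filters metadata rows on `nFeature_RNA` and `percent.mt`, the metric names mentioned in this guide. The function names are hypothetical, and the default thresholds are illustrative placeholders, not values recommended by scHNDB, so choose cutoffs that suit your tissue and protocol.

```python
def passes_qc(cell, min_features=200, max_percent_mt=20.0):
    """Return True if a cell's metadata row meets the illustrative thresholds."""
    n_features = float(cell["nFeature_RNA"])
    percent_mt = float(cell["percent.mt"])
    return n_features >= min_features and percent_mt <= max_percent_mt


def filter_cells(rows, **thresholds):
    """Keep only the metadata rows that pass passes_qc."""
    return [row for row in rows if passes_qc(row, **thresholds)]
```

Each row is expected to be a mapping such as those produced by `csv.DictReader`. Values are converted with `float`, so numbers stored as text are handled.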

Citation

When using scHNDB data in your research, please cite the scHNDB / Peng Lab publication when available, and also cite the Zenodo DOI for any downloaded package (combined 10X, tissue-level, cell-type, or marker-gene deposit) so file version and access date are clear.

[Insert full paper citation and Zenodo DOIs as finalized by the project.]

Troubleshooting

Common Issues

Q: The downloaded files won’t extract

  • A: Ensure you have the appropriate decompression software (e.g., gzip, 7-Zip)
  • Try using command line: gunzip filename.gz
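
Extraction failures are often caused by an incomplete download. One way to test this is to read the .gz file to the end and see whether decompression succeeds. This is a minimal sketch using the standard library; the function name `is_valid_gzip` is hypothetical.

```python
import gzip
import zlib


def is_valid_gzip(path, chunk_size=1024 * 1024):
    """Return True if the whole file decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so that truncation and checksum errors surface.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

If the function returns `False`, download the file again from the Zenodo page before trying other fixes.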

Q: Data won’t load in Seurat/Scanpy

  • A: Verify all three files (barcodes, features, matrix) are in the same directory
  • Check that file names match the expected format
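  • Unzip any ZIP package, but keep the individual files gzip-compressed: by default Read10X and sc.read_10x_mtx look for barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz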

Q: Metadata doesn’t match the expression matrix

  • A: Verify you downloaded all files from the same dataset/category
  • Check that cell barcodes in metadata match those in the barcode file
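
The barcode check can be scripted as follows. This sketch assumes that the first column of the metadata CSV holds the cell barcode and that the file has a header row. Both are assumptions about the file layout, and the function names are hypothetical.

```python
import csv
import gzip


def read_matrix_barcodes(barcodes_path):
    """Read one barcode per line from a gzipped barcodes file."""
    with gzip.open(barcodes_path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def read_metadata_barcodes(metadata_path):
    """Read the first column of a gzipped metadata CSV, skipping the header row."""
    with gzip.open(metadata_path, "rt", newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        return {row[0] for row in reader if row}


def compare_barcodes(barcodes_path, metadata_path):
    """Return counts of shared, matrix-only, and metadata-only barcodes."""
    in_matrix = read_matrix_barcodes(barcodes_path)
    in_metadata = read_metadata_barcodes(metadata_path)
    return {
        "shared": len(in_matrix & in_metadata),
        "matrix_only": len(in_matrix - in_metadata),
        "metadata_only": len(in_metadata - in_matrix),
    }
```

A non-zero `matrix_only` or `metadata_only` count suggests that the files come from different packages, or that the barcodes carry different prefixes or suffixes in the two files.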

Q: Download links are not working

  • A: Most packages are hosted on Zenodo; ensure your network allows access to zenodo.org (and check institutional firewalls).
  • Open the Zenodo page in a new tab and download manually if the in-page button is blocked by the browser.
  • For mirror options (e.g., China OSS), follow updates on the Resource page when available.
  • Contact us if the issue persists: cainzhi@foxmail.com

Q: The interactive Search page is slow or images are missing

  • A: Clear your browser cache, try another browser, and check that scripts and JSON paths are not blocked. Large manifests may take a few seconds on first load.

Getting Help

If you encounter issues not covered in this guide:

  • Check our FAQ section (if available)
  • Contact us via email: cainzhi@foxmail.com
  • Report issues on our GitHub repository

Updates and Maintenance

scHNDB is regularly updated with new datasets and features. Check back periodically for:

  • New sample datasets
  • Additional cell type annotations
  • Enhanced visualization tools
  • Updated analysis pipelines
📅 Last Updated: April 2026