Release 2025.02.R1
Release posted: March 10, 2025
Fixes
Clinical metadata updates
It was recently brought to our attention that the diagnoses of 17 RA/SLE subjects had been mislabeled in AMP-RA.SLE_clinical.csv. Please use syn43215073.4 to ensure the most accurate information. We have updated this file and all corresponding clinical metadata files in the portal to reflect the following:
- 15 subjects changed from RA to OA
- 2 subjects changed from ‘control’ to SLE
The valid values within these files were also updated to conform to the current version of the ARK Portal data model.
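A versioned reference such as syn43215073.4 combines a Synapse entity ID with a version number. The sketch below is one possible way to split such a reference so that the pinned version can be passed to a download tool. The helper name parse_synapse_ref and the SynapseRef type are illustrative and not part of any ARK or Synapse API.

```python
import re
from typing import NamedTuple, Optional


class SynapseRef(NamedTuple):
    entity_id: str          # e.g. "syn43215073"
    version: Optional[int]  # e.g. 4, or None when no version is pinned


# "syn" followed by digits, optionally followed by ".<version>"
_SYN_REF = re.compile(r"^(syn\d+)(?:\.(\d+))?$")


def parse_synapse_ref(ref: str) -> SynapseRef:
    """Split a reference like 'syn43215073.4' into its ID and version."""
    match = _SYN_REF.match(ref.strip())
    if match is None:
        raise ValueError(f"not a Synapse ID: {ref!r}")
    entity_id, version = match.groups()
    return SynapseRef(entity_id, int(version) if version else None)


ref = parse_synapse_ref("syn43215073.4")
print(ref.entity_id, ref.version)  # prints: syn43215073 4
```

With the Synapse Python client, the two parts would typically be passed as syn.get(entity_id, version=version). Please check the client documentation for the version you have installed.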
Dataset annotation updates
Metadata describing all 21 datasets have been updated to conform to the current version of the ARK Portal data model. This includes standardized and harmonized labels and resolves the outstanding issue related to assay annotations noted in release 2024.08.R1. Additional metadata fields added to datasets include: associatedCodeURL, dbGapAccession, associatedDataset, publicationSynID, and libraryPrepMethod.
Release 2024.12.R1
Release posted: January 9, 2025
AMP RA/SLE Datasets
- This publication dataset integrated CyTOF data from the AMP-RA.SLE_PhaseII_CyTOF dataset, corresponding to 167 subjects, and CITE-seq data from the AMP RA.SLE Phase II PBMC CITE-seq dataset, corresponding to 140 subjects.
- By profiling and integrating PBMC gene and protein expression patterns, the authors systematically identified activated lymphocyte phenotypes in RA At-Risk individuals, along with immunophenotypic differences between At-Risk subpopulations.
- In addition to the corresponding files from noted datasets, this dataset also includes publication-specific processed data results in the form of Rds files including SeuratObjects and sparse matrices of protein and gene expression.
News
To facilitate easier data discovery and provide improved support for data reuse, the ARK Portal management team at Sage Bionetworks has been redeveloping the ARK Portal data model. We will be updating annotations for all files and datasets, one dataset at a time. We thank you for your patience as we work to implement these changes. As updates continue, we will include progress reports in these release announcements.
You can explore the ARK data model at https://ark-portal.github.io/data_model/ . The community is welcome to provide feedback and contributions at any time.
Annotation Revisions Completed
- AMP RA.SLE Phase II PBMC CITE-seq
- SLE experiment protocol added.
- AMP-RA.SLE PhaseII CyTOF
Known Issues
Updating file and dataset annotations may cause unexpected issues on the ARK Portal site. Please notify our team through the help desk if you experience issues navigating the site. At this time, the Explore All Data and Explore Datasets pages will show a mix of metadata columns and values. Once all datasets have been updated, these pages will be revised to surface only the columns and values defined in the new ARK data model.
Release 2024.10.R1
Release posted: November 25, 2024
AMP RA/SLE Datasets
- AMP RA.SLE Phase II PBMC CITE-seq
- CITE-seq experiment profiling PBMCs in two highly multiplexed experiments targeting either SLE and control cases or RA, RA at-risk, and control cases. The SLE experiment profiled 248 biospecimens corresponding to 149 individuals, and the RA experiment profiled 198 biospecimens corresponding to 150 individuals.
- Surface protein profiling used the BioLegend TotalSeq™-A Human Universal Cocktail v1 targeting 163 proteins.
- Dataset includes 10x Genomics Chromium scRNA-seq and feature barcode sequencing library fastq files, sample and library metadata, and several Rds files of gene and feature barcode counts.
- AMP SLE Phase II Urine scRNA-seq
- Multiplexed 10x Genomics scRNA-seq profiling cells collected from 338 urine samples from 155 SLE cases.
- Dataset includes raw fastq files, Cell Ranger MEX output, demuxlet output, and a Seurat Object of combined, demultiplexed cell gene counts.
Release 2024.09.R1
Release posted: October 21, 2024
AMP RA/SLE Datasets
- AMP RA.SLE Genomic Variants v2
- AMP RA/SLE subject SNP genotyping by Illumina Infinium Multi-Ethnic Global BeadChip. These results are derived from improved processing of the same data used to produce the results shared in the AMP RA/SLE Genomic Variants v1 dataset included in the ARK 1.0 release. Data were processed to remove duplicated samples and genetically duplicated specimens.
Release 2024.08.R1
Release posted: September 6, 2024
AMP RA/SLE Datasets
- AMP-RA Phase I Synovium CyTOF
- AMP RA Phase I CyTOF of 28 dissociated synovium samples, collected either as biopsies or arthroplasties, from RA and OA cases, profiled with a 35-parameter synovial marker panel.
- AMP-SLE LN class II kidney scRNA-seq
- AMP SLE Phase II kidney 10x Genomics scRNA-seq dataset generated from 12 class II lupus nephritis (SLE/LN) and 5 healthy control cases (Ctrl) from the AMP RA/SLE METRO team. Only raw fastq files have been included in this dataset at this time. Processed data will be made available in a subsequent ARK Portal release.
Fixes
- Prior to this release, some files were saved on download under a name that differed from the one displayed in the portal. This issue is now resolved, so downloaded file names match what users see in ARK. Below is a table to help users map previously downloaded files to the correct, up-to-date file names.
- The markerPanel annotation has replaced cellType for all FCS files in AMP-RA.SLE_PhaseII_CyTOF.
Known Issues
We identified a formatting error with the Dataset Collections assay label. Faceted search can still be used to identify and select datasets of interest. This will be fixed in an upcoming release.
Release 2024.07.R1
Release posted: July 31, 2024
AMP RA/SLE Datasets
- AMP-RA.SLE_PhaseI_CyTOF
- AMP RA/SLE Phase I single cell CyTOF data on 79 PBMC samples profiling three panels and 47 total leukocyte samples profiling two panels.
- This dataset includes the PBMC CyTOF data in the AMP-RA.SLE_PBMCCyTOF dataset originally included in Release 1.0. The AMP-RA.SLE_PBMCCyTOF dataset is now deprecated and will be removed in a subsequent release.
- AMP-RA_Synovium_Low-input_Bulk_RNA-seq
- AMP RA Phase I synovium low-input bulk RNA-seq data as described in Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Includes samples from 40 rheumatoid arthritis (RA) and 14 osteoarthritis (OA) patients where synovial tissue was collected by either arthroplasty or biopsy. Synovial cells were dissociated and FACS used to collect populations of B cells, T cells, fibroblasts, and monocytes for each sample, generating 192 bulk RNA-seq libraries.
- Dataset includes raw fastq files, TPM gene counts, and sample and library metadata.
Fixes
- Relative to AMP-RA.SLE_PBMCCyTOF, file annotations and, where applicable, filenames have been updated for all AMP RA/SLE Phase I PBMC CyTOF files in AMP-RA.SLE_PhaseI_CyTOF.
- This includes the addition of markerPanel as an annotation to replace cellType for Phase I CyTOF FCS files. A similar update to Phase II CyTOF FCS files is scheduled for a subsequent release.
- AMP-RA.SLE_clinical.csv has been updated to accurately label osteoarthritis (OA) patients who had previously been labeled as RA. The diagnosis label was corrected for additional individuals as well. In total, 24 diagnosis labels have been revised. All ARK datasets have been updated to include this new version of the data (syn47136972.2). Some file names and file annotations have been updated to reflect the diagnosis label corrections; however, an additional review will be performed in a future release to validate that all file names and annotations are up to date.
Data Release 2024.06.R1
Release posted: July 9, 2024
Starting with this release, release version names will follow a new convention that includes the year and month in which the release occurred: YYYY.MM.R#, where R# is a unique release tag to distinguish between releases in the same YYYY.MM.
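As an illustration of the convention above, the following minimal sketch parses a release name into its year, month, and release tag so that releases can be validated and sorted chronologically. It assumes that R# is always the letter R followed by a positive integer, which matches the release names used so far but is not stated explicitly. The names ReleaseTag and parse_release_tag are illustrative only.

```python
import re
from typing import NamedTuple


class ReleaseTag(NamedTuple):
    year: int
    month: int
    release: int  # the number after "R" (assumed to be an integer)


# YYYY.MM.R# with a four-digit year and a two-digit month
_TAG = re.compile(r"^(\d{4})\.(\d{2})\.R(\d+)$")


def parse_release_tag(tag: str) -> ReleaseTag:
    """Parse a release name such as '2024.06.R1'."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not a YYYY.MM.R# release name: {tag!r}")
    year, month, release = (int(group) for group in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {tag!r}")
    return ReleaseTag(year, month, release)


tags = ["2024.06.R1", "2025.02.R1", "2024.12.R1"]
# Tuples compare field by field, so sorting by the parsed tag is chronological.
print(sorted(tags, key=parse_release_tag))
# prints: ['2024.06.R1', '2024.12.R1', '2025.02.R1']
```

Earlier release names such as 1.0 and 2.0 predate the convention and are rejected by this parser.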
Publication Datasets
- The chromatin landscape of pathogenic transcriptional cell states in rheumatoid arthritis
- This publication dataset includes 10x Genomics unimodal scATAC-seq raw fastqs, Cellranger output, and processed data of synovial biopsies from 14 RA and 4 OA patients from AMP RA Phase II.
- Processed data is also available for additional synovial tissue samples from 11 RA patients and 1 OA patient.
- Additional public data, not part of AMP RA/SLE, were used to generate some of the findings reported in this publication. Please see the publication for more details.
AMP RA/SLE Datasets
- AMP-SLE_Kiloplex_Proteomics_Urine
- Both Phase I and Phase II data released.
- Phase I data include abundances for 1000 protein targets profiled in urine collected from lupus nephritis (LN) patients and seven healthy controls (HC). A total of 36 unique LN patients were profiled at one or more time points starting at the time of the diagnostic renal biopsy and 3, 6, and 12 months after. A total of 21-30 LN samples are available at any one time point.
- Phase II data include abundances for 1200 protein targets profiled in urine from 226 systemic lupus erythematosus (SLE) patients with one or more sample time points and 10 healthy control samples.
- AMP-RA.SLE_PhaseII_CyTOF
- AMP RA/SLE Phase II single cell CyTOF data on PBMC samples profiling four panels and total leukocyte samples profiling one panel.
- Includes data for 323 TL samples and 452 PBMC samples collected in multiplexed, randomized batches. Both raw and ungated, debarcoded data available in FCS files.
Known Issues
- AMP-SLE_Kiloplex_Proteomics_Urine
- The Phase II clinical data in Variable dictionary.xlsx and ph2_urine_proteomics_basicclinicodemographic.txt references prcr as “Urine protein to creatinine ratio, numeric”. However, the publication refers to this metric as UPCR.
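Until the naming is aligned, users who want their tables to match the publication can rename the column on load. The sketch below is one way to do this with the Python standard library. It assumes the clinical text file is tab-delimited with a header row, which is not confirmed here, and it runs on a made-up two-row excerpt rather than the real file. The subject_id column and the values shown are hypothetical.

```python
import csv
import io


def rename_columns(rows, mapping):
    """Yield each row with its keys renamed according to mapping."""
    for row in rows:
        yield {mapping.get(key, key): value for key, value in row.items()}


# Hypothetical excerpt: the real file's delimiter and other columns may differ.
sample = "subject_id\tprcr\nS001\t1.8\nS002\t0.4\n"

reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
rows = list(rename_columns(reader, {"prcr": "UPCR"}))
print(rows[0])  # prints: {'subject_id': 'S001', 'UPCR': '1.8'}
```

To apply this to the downloaded file, replace io.StringIO(sample) with an open file handle and adjust the delimiter if needed.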
Data Release 2.0
Release posted: June 6, 2024
Publication Datasets
- Deconstruction of rheumatoid arthritis synovium defines inflammatory subtypes
- This publication dataset includes multimodal, single cell CITE-seq data that is new to ARK corresponding to 82 synovial tissue samples from RA patients exhibiting moderate to high disease activity.
- Additional AMP RA/SLE datasets used in this publication include RA Phase I single-cell RNA-seq (AMP-RA CEL-Seq) and genomic variant data still pending release.
- Link to author’s data exploration and visualization site included in dataset description.
- Please note, additional public data, not part of AMP RA/SLE, were used to generate some of the findings reported in this publication.
- Clonal associations between lymphocyte subsets and functional states in rheumatoid arthritis synovium
- This publication used RA Phase II TCR and BCR single cell V(D)J sequencing data from synovium and PBMC samples corresponding to 12 subjects in the AMP-RA_scImmuneRepertoireSeq Dataset.
AMP RA/SLE Datasets
- SLE Phase II kidney scRNA Seq
- 10x Genomics Chromium single cell RNA-seq of 202 kidney samples. Includes raw fastq files, CellRanger output, and additional processed data as .Rds files for loading into R.
- SLE phase II kidney snRNA-seq
- 10x Genomics Chromium single nuclei isolation RNA-seq of 50 kidney samples. Includes raw fastq files, CellRanger output, and additional processed data as .Rds files for loading into R.
Fixes
- AMP-SLE_C1-HT_mRNA-Seq
- A corrupted fastq file was identified and removed from the dataset along with the corresponding R1 fastq file.
- specimenID information was added to the dataset-specific metadata file for 46 specimenID values that were missing from Release 1.0.
Known Issues
- AMP-SLE_C1-HT_mRNA-Seq
- Conflicting specimenID and specimenType labels were identified in the metadata for six specimenID values.
Data Release 1.0
Release posted: December 14, 2022
This is the first data release for the AMP® RA/SLE program. It focuses mostly on Phase I data but does not include the complete Phase I set. Additional Phase I data will be provided in release 2.0. This release consists of the following content:
- Phenotypic data: For this first release this includes race, ethnicity, diagnosis, and age of recruitment for RA_PhaseI, RA_PhaseII, SLE_PhaseI, and SLE_PhaseII. The phenotypic data is provided as a csv file linked to all experimental data and manuscript datasets.
- Genotypes: SNP genotypes on RA_PhaseI, RA_PhaseII, SLE_PhaseI, and SLE_PhaseII participants. Provided are Plink files and imputed data.
- Immune repertoire sequencing: This is single cell RNAseq data on RA_PhaseII B and T cells from synovial tissue and matched peripheral blood collected from 13 RA patients. Provided are the fastq files from both the gene expression and antigen receptor repertoire sequencing.
- Part of this data was used in “Granzyme K+ CD8 T cells form a core population in inflamed human tissue”, which is linked through this publication dataset.
- SLE CEL-seq: This is single cell RNAseq data on SLE_PhaseI kidney tissue from 34 SLE cases and controls, plus urine from an overlapping subset. Provided are fastq files and cell type expression data.
- RA CEL-seq: This is single cell RNAseq data on RA_PhaseI synovium from 58 RA/OA cases. Provided are fastq files and cell type expression data.
- C1 HT seq: This is single cell RNAseq data on SLE_PhaseI kidney and skin tissue from 23 SLE cases and controls. Provided are fastq files.
- PBMCCyTOF: This is single cell CyTOF data on RA_PhaseI and SLE_PhaseI B cells, T cells, and myeloid cells from PBMCs from 79 individuals with rheumatoid arthritis, osteoarthritis, or systemic lupus erythematosus, and controls. Provided are raw, normalized, and debarcoded FCS files.