SPring-8, the large synchrotron radiation facility

World's Largest Scale Database of Experimental Data for Crystal Structure Analysis of Protein - Database of experimental data including 900,000 protein crystallization conditions is now publicly available - (Press Release)

Release Date
23 Jul, 2009
 
RIKEN has made its database of experimental data of proteins publicly available on the web. The database mainly includes X-ray crystal structure analysis data of microbial proteins, and is useful for bioscience research.

RIKEN

Key research achievements
• Accumulation of a huge amount of experimental data for the crystal structure analysis of proteins obtained using high-brilliance X-rays at SPring-8
• Effective utilization of the experimental data as reference data for protein research and for the development of new methods of analysis
• Shared database using the Semantic Web, enabling easy reuse and automatic processing of data (a world first in this area)

RIKEN (Ryoji Noyori, President) has made its database of experimental data of proteins publicly available on the web.  The database mainly includes X-ray crystal structure analysis data of microbial proteins, and is useful for bioscience research.  This was achieved through cooperation with Yukihiko Asada, a research associate, and Naoki Kunishima, an assistant group director of the Protein Crystallography Research Group at RIKEN SPring-8 Center (Tetsuya Ishikawa, Director).  Data from the database can be downloaded from the RIKEN Life Science Networking System "RIKEN SciNes"*1 provided by the Bioinformatics and Systems Engineering Division of RIKEN (Tetsuro Toyoda, Director) from July 23 onwards (http://database.riken.jp).

Determining the steric structures of proteins by X-ray crystal structure analysis and other methods is necessary to understand the phenomena of life at the atomic level and apply the obtained results in the fields of medicine and industry.  As a result of the structural genomics projects*2 recently promoted worldwide, a fundamental research system, which enables life scientists to easily carry out X-ray crystal structure analysis, has been established.  The research group has made public a systematic and detailed database (data size: 5.0 TB = 5 × 10^12 B, number of files: 97 million) of experimental data accumulated using high-brilliance X-rays at SPring-8 that can be used for crystal structure analysis of microbial proteins, variant proteins,*3 and proteins labeled with heavy atoms.  The aim is to bring the achievements of the structural genomics projects to wider society.
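As a rough illustration of the scale quoted above, the following short sketch works out the average file size implied by the two published figures (5.0 TB and 97 million files). The variable names are chosen here for illustration only, and the result is a simple mean that says nothing about how file sizes are actually distributed in the database.

```python
# Figures quoted in the release
total_bytes = 5 * 10**12   # data size: 5.0 TB
file_count = 97 * 10**6    # number of files: 97 million

# Simple mean; the real distribution of file sizes is not given in the release
average_bytes = total_bytes / file_count

print(f"average file size: {average_bytes / 1000:.1f} kB")
```

On these figures a typical file is on the order of tens of kilobytes, which suggests many small records rather than a few large archives.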

To obtain the crystal structure of a target protein, several processes are necessary: 1. preparation of a sample, 2. crystallization, 3. X-ray diffraction experiment, and 4. determination of the steric structure by calculation.  A huge amount of data is obtained from each process.  Experimental data in the database that can be used for the crystal structure analysis of microbial proteins are made publicly available in a format such that the users can reedit the data by themselves for their research purposes, facilitating the development of software to efficiently determine the steric structures of proteins.  Furthermore, because this is the world's largest scale database, containing a huge amount of information on similar proteins, the database will support the protein research carried out by life scientists.

In the database of variant proteins, experimental data on numerous variant proteins, obtained through the development of methods of crystal structure analysis by the research group, have been made public.  These variant proteins have unique advantages for comparing the characteristics of several proteins in that they were crystallized under similar conditions.  Therefore, making public the experimental data in a format to enable the users to reuse the data will contribute to the acceleration of data use in the field of bioinformatics*4 and the development of drugs.

For the crystal structure analysis of proteins, the preparation of protein crystals labeled by a reagent containing heavy atoms such as platinum is sometimes necessary.  The research group developed the HATODAS software, which is the world's most advanced software in this field and can perform a Web search for the appropriate heavy-atom reagent on the basis of the amino acid sequence and the solution conditions of the target protein, to further promote the crystal structure analysis of proteins among life scientists.  In the database of experimental data obtained from the crystal structure analysis of proteins labeled with heavy atoms, the accumulated data in HATODAS have been made public so that users can reuse the data, facilitating the use of data in the field of protein engineering.*5

This project was carried out as part of the Integrated Database Project*6 supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT).


<Figure>

Fig. 1 Flow of crystal structure analysis of proteins and experimental data


Fig. 2 Heavy-atom database, HATODAS (http://hatodas.harima.riken.jp)
(Sugahara et al. (2005) Acta Cryst. D 61, 1302-1305;
 Sugahara et al. (2009) J. Appl. Cryst. 42, 540-544)
Panels: Search result; Experimental data for crystal structure analysis


Fig. 3 Sample database screens of RIKEN Life Science Networking System "RIKEN SciNes" (http://database.riken.jp)
Panels: Crystal structure experimental data; Heavy-atom experimental data


<Glossary>

*1 RIKEN SciNes
RIKEN Life Science Networking System (RIKEN SciNes) is the foundation system for making information publicly available, which is managed by the Bioinformatics and Systems Engineering Division of RIKEN.  RIKEN SciNes integrates a fundamental system for developing databases on life-science-related data in research institutions, enabling a large amount of data to be made publicly available in accordance with an international standard called the Semantic Web.  Using RIKEN SciNes, individual researchers who belong to a research institution do not need to maintain a Web server by themselves; instead they can smoothly transmit and publicize their original database as research results to interested parties outside of the institution.  In this way, SciNes serves as an information system by which Japanese researchers can contribute to international collaborative research.  
Reference Websites:
http://omicspace.riken.jp/publications/toyoda2009ja.pdf
http://www.riken.jp/r-world/info/release/press/2009/090331_2/detail.html
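
The Semantic Web approach mentioned above publishes data as subject-predicate-object statements that software can process automatically. The following is a minimal sketch of that idea in plain Python. The identifiers (`ex:crystal/0001`, `ex:precipitant`, and so on) and the values are hypothetical, invented for illustration; they are not the actual RIKEN SciNes schema or data.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical records; names and values are illustrative only
triples = [
    Triple("ex:crystal/0001", "ex:ofProtein", "ex:protein/A"),
    Triple("ex:crystal/0001", "ex:precipitant", "PEG 4000"),
    Triple("ex:crystal/0001", "ex:pH", "7.5"),
    Triple("ex:crystal/0002", "ex:ofProtein", "ex:protein/B"),
]


def objects(subject: str, predicate: str) -> list[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects("ex:crystal/0001", "ex:precipitant"))
```

Because every statement has the same three-part shape, data from different sources can be merged and queried without a shared table layout, which is what makes reuse and automatic processing straightforward.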

*2 Structural genomics project
Information on the steric structures of proteins is necessary to understand organisms at the atomic level.  After genomics, i.e., deciphering the genomes of various types of organisms, large-scale worldwide research projects are underway to thoroughly determine the steric structures of proteins encoded by the genomes and provide the results as a foundation for research and development on the basis of steric structures.  This field of study is called structural genomics.  In Japan, for five years from FY 2002 through FY 2006, a national project on structural genomics called the MEXT Protein 3000 Project was carried out.  Under this project, the steric structures of various proteins were determined and a research foundation enabling efficient structural analysis was established.

*3 Variant proteins
By genetic manipulation, a specific amino acid residue of a target protein can be replaced with an arbitrary type of amino acid (site-specific mutagenesis).  Variant proteins are proteins derived from the wild-type protein by introducing such mutations.

*4 Bioinformatics
Bioinformatics is a field of study in which biology-related problems are solved by techniques from applied mathematics, information sciences, statistics, and computer sciences.  Bioinformatics is also called bioscience and informatics.  A huge amount of biological information has recently become available through genome projects and structural genomics projects targeting various types of organisms.  The use of such information for the development of useful bioinformatics technologies, such as the systematic analysis and prediction of structures and interactions of proteins, is desired.

*5 Protein engineering
Proteins perform various functions to sustain life and play various roles in living organisms.  The aim of protein engineering is to design artificial proteins with desirable functions by artificially modifying natural proteins.  Each protein has a specific amino acid sequence and performs specific functions by forming a specific steric structure.  Accordingly, genetic manipulation is commonly used as a method of protein engineering to produce variant proteins.

*6 Integrated Database Project
The Integrated Database Project is a project supported by MEXT to promote the integration of the life information database as a base for supporting life-science-related research.  To improve the usability of the life-science-related database in Japan, the Project promotes the integration of the database through planning, evaluation, and support of strategies to maintain the life-science-related database, develop fundamental technologies to realize integrated databases, maintain portal sites, and other measures.  RIKEN has participated in a subsidiary program of the Integrated Database Project with the title "Plant Omics and Steric Structure Information of Proteins," which is a 4-year project that started in FY 2007, and provided experimental data and other materials to the Integrated Database Project.


For more information, please contact:
Dr. Naoki Kunishima (RIKEN SPring-8 Center)
