1 Automated high-throughput analysis of metal atoms in biological macromolecules using ion beam analysis
Charlie Bury, Elspeth Garman (University of Oxford, Department of Biochemistry) and Geoff W. Grime (and Chris Jeynes) (University of Surrey, Ion Beam Centre)
2 Outline
PIXE (particle-induced X-ray emission) analysis of proteins
High throughput methods in biochemistry
High throughput Ion Beam Analysis
  Sample presentation
  Analysis protocols
  Data processing
Examples
3 Metals in proteins
Many proteins contain small numbers of metal atoms
  Binding and transport of metals
  A single metal atom helps to determine the folded shape of the molecule
X-ray crystallography measures electron density
  Cannot determine Z of metal atoms
  Z is often inferred indirectly from molecular modelling
Unknown metal atoms
4 PIXE (particle-induced X-ray emission) analysis of proteins
Structure 1999; PIXE conference 2004; Progress in Biophysics & Molecular Biology 2005
Concentrations are typically 1 atom per molecule of 10–100 kDa (tens to hundreds of ppm)
Available sample size is small (microlitres of solution)
MicroPIXE may be the only way to identify and quantify unknown metal atoms in proteins
5 Quantification
Absolute PIXE concentrations have a large error
But: most (95%) proteins contain known numbers of the amino acid residues methionine and cysteine
These contain sulfur, which provides an internal standard:
$$N_Z = N_S \, \frac{C_Z}{C_S} \, \frac{M_S}{M_Z}$$
$N_Z$ = number of unknown metal atoms per molecule
$N_S$ = KNOWN number of sulfur atoms per molecule
$C$ = concentration determined by PIXE (wt/wt)
$M$ = atomic mass
The expected results are small integers (or ratios of small integers), so high accuracy may not be essential
6 Dependence on sample thickness
Results MUST be corrected for sample thickness
Simultaneous RBS provides matrix composition and thickness and allows correction
Figure: Zn:S and Mg:S elemental mass ratios calculated for peaks of equal area, assuming different thicknesses of matrix C5O (and 3 MeV protons)
7 An example
Cytochrome c550 from Paracoccus denitrificans
15 kDa, 135 amino acids
1 Fe atom and 6 S atoms from methionine and cysteine residues
Sample: 0.3 µl drop of protein in buffer solution at 1.35 mg/ml. Total mass of protein = 0.41 µg
Thickness-corrected concentrations from PIXE: S: 1520 ppm, Fe: 430 ppm
Number of Fe atoms: (430 × 32 × 6) / (1520 × 56) = 0.97 (±0.18)
The Fe is an essential component of the molecule, so the result should always be 1 Fe per molecule
Cytochrome c can be used as a QA material
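As a minimal sketch of the sulfur internal-standard arithmetic in this example, the following Python function evaluates the same ratio of concentrations and atomic masses. The function name `atoms_per_molecule` and its parameter names are illustrative assumptions, not part of the analysis software; the atomic masses are the rounded values (32 and 56) that appear in the slide's calculation.

```python
def atoms_per_molecule(c_metal, c_sulfur, m_metal, n_sulfur, m_sulfur=32.0):
    """Metal atoms per molecule from PIXE mass concentrations.

    c_metal, c_sulfur: concentrations in the same units (e.g. ppm by weight)
    m_metal, m_sulfur: atomic masses
    n_sulfur: known number of sulfur atoms per molecule
    """
    if c_sulfur <= 0 or m_metal <= 0:
        raise ValueError("sulfur concentration and metal mass must be positive")
    # Convert the mass ratio to an atom ratio, then scale by the known S count.
    return n_sulfur * (c_metal / c_sulfur) * (m_sulfur / m_metal)


# Cytochrome c550 numbers from the slide: 6 S atoms, S = 1520 ppm, Fe = 430 ppm.
n_fe = atoms_per_molecule(c_metal=430, c_sulfur=1520, m_metal=56, n_sulfur=6)
print(round(n_fe, 2))  # 0.97
```

Because only the ratio of the two concentrations enters, any common scale error in the absolute PIXE concentrations cancels.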
8 Non-integer results?
Many measurements yield non-integer results: x atoms per molecule
This is not necessarily poor technique …
Metal binding proteins: many proteins have a metal binding or transport function. In addition to the ‘structural’ atoms (S) they contain one or more (N) metal atom binding sites, which may or may not be occupied, so x can lie anywhere between 0 and N.
Polymers: many proteins function by forming polymers in which two or more (M) identical monomer molecules come together to create, for example, a metal binding site. The measured metal content relates to the monomer molecule, so a fully occupied site shared between M monomers gives x = 1/M.
9 A typical problem … is the Fe atom present?
Frataxin: a novel synthesised protein for the delivery or detoxification of iron
A trimer of identical monomer molecules forming a ‘cage’ for the Fe atom
When the site is occupied, x = 1/3
Smaller values indicate partial occupancy
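The link between the measured value x, the number of binding sites and the number of monomers can be sketched in a few lines of Python. This assumes the simple model x = f·N/M, where f is the fractional occupancy (a symbol introduced here for illustration), N the number of binding sites per assembly and M the number of monomers per assembly. The function names are hypothetical.

```python
from fractions import Fraction


def expected_x(n_sites, n_monomers, occupancy=Fraction(1)):
    """Expected metal atoms per monomer under the assumed model x = f * N / M."""
    if n_sites < 0 or n_monomers < 1:
        raise ValueError("need n_sites >= 0 and n_monomers >= 1")
    return occupancy * Fraction(n_sites, n_monomers)


def occupancy_from_x(x, n_sites, n_monomers):
    """Invert the model: fraction of sites occupied for a measured x."""
    if n_sites < 1:
        raise ValueError("need at least one binding site")
    return x * n_monomers / n_sites


# Trimer forming a single Fe site, as in the frataxin example.
print(expected_x(1, 3))             # 1/3
print(occupancy_from_x(0.2, 1, 3))  # roughly 0.6, i.e. partial occupancy
```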
10 What precision do we need?
M is rarely greater than 4
Accuracy of ±0.1 in x is sufficient to quantify fully occupied sites
The precision of the ratio of peak areas is given by
$$\sigma_{M/S} = \sqrt{\sigma_M^2 + \sigma_S^2}$$
where $\sigma_M$ and $\sigma_S$ are the relative Gaussian counting errors of the metal and sulfur peaks in the spectrum
In the case where the peak areas are similar, this implies that both the S and metal peaks should have >200 counts (7% precision each, about 10% on the ratio)
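The 200-count figure can be checked with a short calculation. This is a sketch that assumes pure Poisson counting statistics (relative error 1/√N) and ignores background and fitting uncertainty, which a real peak fit would add; the function names are illustrative.

```python
import math


def relative_counting_error(counts):
    """Relative 1-sigma error of a peak area, assuming Poisson statistics."""
    if counts <= 0:
        raise ValueError("counts must be positive")
    return 1.0 / math.sqrt(counts)


def ratio_relative_error(metal_counts, sulfur_counts):
    """Relative error of the metal/sulfur ratio: errors added in quadrature."""
    return math.hypot(relative_counting_error(metal_counts),
                      relative_counting_error(sulfur_counts))


print(f"{relative_counting_error(200):.1%}")    # 7.1%
print(f"{ratio_relative_error(200, 200):.1%}")  # 10.0%
```

With 200 counts in each peak the ratio is known to about 10%, which corresponds to ±0.1 for x close to 1.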
11 Another example: the FurB monomer of M. tuberculosis
Iron binding protein identified (Ferric Uptake Regulator) with three metal sites: how many of these are occupied?
One, but the metal is ZINC!
The paper was quickly rewritten … (J. Biol. Chem. 282, 2007)
12 This method has been used successfully for over 20 years:
Sample | Elements of interest | Result (atoms/molecule) | Reference
Reverse transcriptase + DNA oligodeoxynucleotide 22:9mer P 1.2 (± 0.5) oligonucleotide per dimer * (Jones et al., 1993) Viral neuraminidase (subtype N8) Ca Present * (Taylor et al., 1993) Bacterial neuraminidase (VC) (Crennell et al., 1994) Glucose dehydrogenase Zn (John et al., 1994) Epidermal growth factor like domain of factor IX 1.7 (± 0.2) * (Rao et al., 1995) p13(suc1) cell-cycle control protein 1.3 (± 0.2) (Endicott et al., 1995) Protein phosphatase 1 (native) Mn Fe 0.93 (± 0.08) 0.46 (± 0.05) *(Egloff et al., 1995) Protein phosphatase 1 (with bound tungstate) 0.92 (± 0.08) 0.45 (± 0.05) Reduced desulphoferrodoxin 1.1 (± 0.1) 0.5 (± 0.05) A Coelho, unpublished data Oxidised desulphoferrodoxin 2.3 (± 0.15) 1.35 (± 0.15) Class I ribonucleotide reductase R2 subunit mutant D84A Hg bdl † 0.23 (± 0.03) 1.5 (± 0.12) (Nordlund et al., 1990)
13 This method has been used successfully for over 20 years:
Class III anaerobic ribonucleotide reductase Fe Zn Cu bdl † 1.5 (± 0.3) 3.0 (± 0.5) (Logan et al., 1999) Blue tongue virus (BTV) core Counter ions Ca, Zn at low concentration *(Gouet et al., 1999) VP7 capsid protein from BTV Present (Basak et al., 1997) CD80 (B7) Se S:Se = 1.8 (± 0.2) *(Davis et al., 2001) Nitrous Oxide Reductase (liquid sample) 1.3 Arif Jafferji, D.Phil thesis U. of Oxford, 2000 Nitrous Oxide Reductase (crystal) 4.4 (± 0.3) RhoA.GDP.AlF4-/RhoGAP RhoA.GDP/RhoGAP Al:P Mg:P Mg:Al Al 0.39 (± 0.05) 0.34 (± 0.04) 0.9 (± 0.16) 0.86 (± 0.1) † bdl *(Graham et al., 2002) DAF34 (Au soaked crystal) Au *(Garman and Murray, 2003) (Williams et al., 2003) DAF34 (Hg soaked crystal) Hg DAF34 (HgI crystal) I 1 cd1 Nitrate Reductase 1 (± 0.3) (Fulop et al., 1995) Pseudoazurin 2000 0.95 (± 0.1) (Williams et al., 1995)
14 This method has been used successfully for over 20 years:
Tau-D (crystal) Fe 0.003 (Elkins et al., 2002) Wild type tyrosinase Variant 1 tyrosinase Variant 2 tyrosinase Cu 1.62 (± 0.1) 1.55 (± 0.1) bdl † *(Branza-Nichita et al., 2000) RsbW Se Present R. Lewis, personal communication SpOA P Ferric Uptake Regulator (preparation 1) Present (near mdl) *(Pohl et al., 2003) Ferric Uptake Regulator (preparation 2) Co ~2 GCM Zn 1.70 (± 0.15) *(Cohen et al., 2002) ssDNA + binding protein complex (washed crystal) 4.4 (± 0.3) *(Backe et al., 2004) ssDNA binding protein complex without ssDNA (crystal) 1.1 (± 0.4) × 10⁻² Ferric Uptake Regulator 2.6 (± 0.3) 1.9 (± 0.2) BslI Restriction Endonuclease 7.3 (std. dev. 1.0) *(Vanamee et al., 2003) FIH, an Fe(II) dependent enzyme (Elkins et al., 2003) Anastellin S (Briknarova et al., 2003) NapC 0.8 (std. dev. 0.5) (Cartron et al., 2002)
15 This method has been used successfully for over 20 years:
EMR2 (Ba soaked crystal) Ba 2.4 (std. dev. 0.05) *(Garman and Murray, 2003) (Abbott et al., 2004) Cytochrome c550 from P. denitrificans Fe 0.97 (std. dev. 0.18) (Timkovich and Dickerson, 1976) NAG6P Diacetylase (crystal) Mn Cu Zn K 0.59 (std. dev. 0.03) 1.4 (std. dev. 0.2) 0.05 (std. dev. 0.02) 0.2 (std. dev. 0.06) 0.1 (std. dev. 0.04) *(Vincent et al., 2004) DAF 1,2,3,4 (Hg soaked crystal with Se substitution) Se Hg 2.4 (std. dev. 0.2) 0.02 (std. dev ) (Lukacik et al., 2004) DAF 1,2,3,4 (Au soaked crystal) Au 12 (std. dev. 1.7) Germin Type IV (pH 5.0) Type II (pH 8.0) 1 *(Woo et al., 2000) ThiND 10.1 (±0.9) E. Rudiño Piñera, private communication
16 This method has been used successfully for over 20 years
More recently:
The effect of X-ray absorbed dose and pH over a fungal laccase: insights into redox potential determination and conformational changes. De la Mora et al., Acta Cryst. (2012) D68,
VARP is recruited onto endosomes by direct interaction with retromer, where together they function in export to the cell surface. Hesketh et al., Developmental Cell (2014), 29,
A complex iron-calcium cofactor catalyzing phosphotransfer chemistry. Shee Chien Yong et al., Science (2014), 345,
Plant cysteine oxidases are dioxygenases that directly enable arginyl transferase-catalysed arginylation of N-end rule targets. White, M. D. et al., Nature Communications (2017) 8, doi: /ncomms14690.
Over 60 papers with references to microPIXE in mainstream biological journals
Surrey is now making this available as a service.
17 There are still issues
Samples are prepared by manual pipetting onto foils
Samples are analysed by manual positioning on the PIXE maps to locate the precipitated protein
Differential precipitation of buffer may require accurate positioning using elemental maps of sulfur
Spectra are processed manually
It is difficult to analyse more than 10 samples in a run day
18 High throughput
Many experiments in biochemistry are now carried out using high-throughput (HT) models
Several parameters are varied simultaneously, giving an array of samples to be analysed, e.g.:
  Robotic ICP-MS
  Array printing with fluorescence imaging
Can we develop an automated IBA method to engage with this type of experiment?
Bring down cost per sample and permit a wider range of applications?
19 Sample preparation: microarray printing
Inkjet deposition of arrays of spots of different solutions
Typical volume of spot 10 nL
Size and separation controllable in the 10 to 100 µm range
Example of fluorescence readout of antibody binding (from Biotechniques 54, 2013, 257)
20 Printed arrays for microPIXE?
Special requirements for IBA:
Thin film substrate (4 μm polypropylene)
  Spot adhesion?
  Film damage during printing?
  Surface wetting issues?
Crosstalk between spots
Reproducible positioning (to simplify sample localisation)
Film must remain flat during printing
Several arrays per sample holder (to optimise beam time)
21 Sample preparation: support film and printing
Sample holders have the same dimensions as a standard microscope slide and are adapted for compatibility with both printer and sample stage
Five 8 × 8 mm sample windows per slide, covered in polypropylene film using a specially developed coating machine and non-instant contact adhesive
Samples supplied as solutions in well plates
Printed by a non-contact ArrayJet microarrayer
Up to 144 samples per 12 × 12 array, 5 arrays per slide
22 Printed arrays
Figure panels: buffer array; protein array
60 µm diameter drops, 200 µm apart
Note the positioning errors and the shape variation
23 Cross-talk?
Array of standard salt compounds (figure labels: CaCl, CsI)
Verify that there is no cross-talk between adjacent spots
24 Locating and analysing the spots
Spots are difficult to see under light microscopy
Elemental maps have poor statistics so cannot be used easily for automatic location
Spot position errors:
  Random errors in printing
  Errors in the printer head position
  Variation in shape due to wetting
Geometric distortion of the array image:
  Scanning coil misalignment (common in quadrupole systems)
  Misalignment of axes (scanning, sample stage motion, array printer motion)
We require a system which allows for these errors
Figure: an array of KBr buffered protein spots and the K Kα X-ray map (2 × 2 mm)
25 Finding the spots
Print ‘Landing Lights’ at the corners of the array: spots of metal salt (e.g. KBr) which are easy to find with PIXE
Move the stage to each corner (operator control) and use a least-squares fitting routine to find the centre of the spot from the PIXE map (this is the only manual operation for each array)
Store the stage coordinates of the corners
Interpolate the stage coordinates of each cell in the array. This corrects for linear geometric distortions.
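The interpolation step can be illustrated with a bilinear scheme over the four fitted corner positions. This is a sketch under stated assumptions, not the routine used in the acquisition software: it assumes the landing lights sit at the centres of the four corner cells of the grid, and the function name, the corner keys (`tl`, `tr`, `bl`, `br`) and the example coordinates are all hypothetical.

```python
def interpolate_cell(corners, row, col, n_rows, n_cols):
    """Bilinear interpolation of the stage position of cell (row, col).

    corners maps 'tl', 'tr', 'bl', 'br' to the fitted (x, y) stage
    coordinates of the four corner cells.
    """
    if n_rows < 2 or n_cols < 2:
        raise ValueError("need at least a 2 x 2 array")
    u = col / (n_cols - 1)  # 0 at the left column, 1 at the right column
    v = row / (n_rows - 1)  # 0 at the top row, 1 at the bottom row
    weights = {
        "tl": (1 - u) * (1 - v),
        "tr": u * (1 - v),
        "bl": (1 - u) * v,
        "br": u * v,
    }
    x = sum(w * corners[key][0] for key, w in weights.items())
    y = sum(w * corners[key][1] for key, w in weights.items())
    return x, y


# Hypothetical corner fits (mm) for a slightly rotated and sheared 12 x 12 array.
corners = {"tl": (0.00, 0.00), "tr": (2.20, 0.10),
           "bl": (-0.10, 2.20), "br": (2.10, 2.30)}
stage_positions = [[interpolate_cell(corners, r, c, 12, 12) for c in range(12)]
                   for r in range(12)]
```

Rotation, scaling and shear between the printer axes and the stage axes are absorbed by the corner positions, so no separate calibration of those angles is needed. Random placement errors of individual spots are not corrected and are left to the per-cell mapping run.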
26 Analysis procedure
Once the corners have been located, the remainder of the procedure is automated. For each spot in the stored list:
  Move the stage to the interpolated coordinates of the centre of the cell
  Start a mapping run of the full cell area until a circular region can be fitted around the spot with good statistics
  Change the scan pattern to the fitted circular pattern (to avoid scanning a lot of blank foil...)
  Run until the spectrum end conditions are reached
Figure: view through the chamber microscope near the end of a HT run. Analysed areas can be seen as faint squares around the spots
27 Spectrum endpoint determination
A problem with automated analysis is determining when to stop
Use a real-time Gaussian peak fitting routine to monitor the statistics of the major lines of the elements of interest. These are defined by the end-user in the sample information file
Stop the run when:
  All of the peaks of interest have counting error less than a specified value
  AND the total count in the RBS spectrum exceeds a minimum value
  OR a maximum run time is reached (to allow for missing samples - in both senses...)
Element of interest   Gaussian counts   Peak error
S                     6600              3.40%
Zn                    220               6.70%
Se                    104               9.80%
Run stopped when all peaks < 10% (7.5 min)
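The stop rule above amounts to a small Boolean test. The following sketch reads it as (all peaks below the error limit AND enough RBS counts) OR timeout. The function name, the RBS count threshold and the time limit are hypothetical values chosen for illustration; only the 10% peak-error limit and the example peak errors come from the slide.

```python
def should_stop(peak_errors, rbs_counts, elapsed_s,
                max_peak_error=0.10, min_rbs_counts=10_000, max_time_s=1_800):
    """Return True when the spectrum end conditions are met.

    peak_errors: relative fitting error per element of interest, e.g. {"S": 0.034}
    rbs_counts: total counts accumulated in the RBS spectrum
    elapsed_s: run time so far, in seconds
    """
    # An empty peak list must not count as "all peaks good enough".
    peaks_ok = bool(peak_errors) and all(
        err < max_peak_error for err in peak_errors.values())
    rbs_ok = rbs_counts > min_rbs_counts
    timed_out = elapsed_s >= max_time_s
    return (peaks_ok and rbs_ok) or timed_out


# Peak errors from the slide's example, reached after 7.5 minutes.
errors = {"S": 0.034, "Zn": 0.067, "Se": 0.098}
print(should_stop(errors, rbs_counts=25_000, elapsed_s=450))  # True
```

The timeout branch is what lets an unattended run move on from an empty cell or a sample with no detectable metal.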
28 Data handling – spectrum processing
Data from the end-user and the ArrayJet spot mapping information are combined into a CSV file
CSV file used as input to set up the data acquisition: OMDAQ (data collection)
Sample data and run statistics written to XML
XML file used to set up batch processing of RBS and PIXE (with human supervision at present!): OMDAQ + GUPIX (spectrum processing)
Element concentrations for each run written back to XML
XML file used to generate atoms/molecule results for end-user
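To make the CSV-to-XML hand-off concrete, here is a small sketch using only the Python standard library. The column names (`row`, `col`, `sample_id`, `n_sulfur`, `elements`) and the XML element names are invented for illustration and do not reflect the actual file layouts used by OMDAQ, GUPIX or the ArrayJet software.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical combined sample sheet: array position plus end-user sample data.
EXAMPLE_CSV = """row,col,sample_id,n_sulfur,elements
0,1,protein_A,6,S Fe
0,2,protein_B,4,S Zn Se
"""


def build_run_xml(csv_text):
    """Turn the combined CSV into an XML tree with one <spot> per sample."""
    root = ET.Element("array")
    for rec in csv.DictReader(io.StringIO(csv_text)):
        spot = ET.SubElement(root, "spot", row=rec["row"], col=rec["col"])
        ET.SubElement(spot, "sample").text = rec["sample_id"]
        ET.SubElement(spot, "n_sulfur").text = rec["n_sulfur"]
        elements = ET.SubElement(spot, "elements")
        for symbol in rec["elements"].split():
            # Concentrations would be written back here after spectrum processing.
            ET.SubElement(elements, "element", symbol=symbol)
    return root


root = build_run_xml(EXAMPLE_CSV)
print(ET.tostring(root, encoding="unicode"))
```

Keeping one record per spot means the same file can carry run statistics and fitted concentrations as later stages add them.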
29 Summary of the first run
36 ‘blind’ protein samples
Total run time 7.5 hours (2.5 MeV protons, ~0.5 nA). This would have required 4 days using the manual method.
All proteins analysed have an entry in the Protein Data Bank (PDB) and are believed to be metal binding proteins
30 Table of results
Columns: PDB ID, Gene, Residues, Metal in PDB, Metals in PIXE (>3×LOD), Potential metals in PIXE (1–3×LOD)
PIXE data consistent with PDB:
1 3NNG BfR258E 168 Ca Ca (1.7) Fe
2 2KPN BcR147A 103 Ca (0.8)
3 3LRQ HR4604D 100 Zn Zn (2.5), Fe (0.3) Ca, Co, Cu
4 3NNQ OR3 114 Ca, Zn* Fe, Ni*
5 N/A LkR105 290 - Fe (0.04) Ca, Cu
6 2K52 MjR117B 80 Ca (0.2)
7 3ESI EwR179 129 Ca, Fe
8 3DM3 MjR118E 105 Na^u
9 3I24 VfR176 149 Co
10 3L8M SyR86 212
11 3FOJ SyR101A Ca, Fe, Cu
12 4EVW VcR193 255 Mg^u
13 2KW4 DhR1A 147 Ca, Fe*
14 3DJB BuR114 223 Fe, Ni
Sample too dilute for PIXE (no S signal):
3D3N LpR108 284 K, Mn
3DC7 LpR109 232 Mg/Na^u
* S signal was below 3 times the limit of detection, so accurate stoichiometries could not be established.
^u Presence of sodium and magnesium could not be confirmed at the proton energies used in these experiments.
Thanks to Eddie Snell of the Hauptman-Woodward Medical Research Institute, Buffalo for permission to present these data.
31 Table of results
Columns: PDB ID, Gene, Residues, Metal in PDB, Metals in PIXE (>3×LOD), Potential metals in PIXE (1–3×LOD)
PDB inconsistent with PIXE:
1 3LV4 BiR14 456 Ca - Ca, Mn
2 3HIX NsR437I 106 Mn
3 3HLY SnR135D 161
4 3DCP LmR141 283 Fe/Zn Ca (3.3), Mn (0.5), Fe (1.2), Co (1.2) Zn
5 3JSR NsR236 119 K
6 3ILM NsR437H 141 Fe, Co
7 3I24 SoR237 137 Na Co (0.7), Zn (0.7) Fe, Ni
8 3GGL BtR324A 169 Ca, Mn, Fe*
9 3KB1 GR157 262 Co
Extra metals present in PIXE:
1 3LMC MuR16 210 Fe/Zn Fe (0.6), Co (0.9), Ni (0.4), Zn (0.7) -
2 3K2Q MqR88 420 Na^u Ca (7.1) Fe
3 3LM8 SR677 222 Mg^u Ca (0.7), Fe (0.05) K/Br
4 3E5Z DrR130 296 Ca*
5 3HNM BtR319D 172 Ca (1.74)
6 3DEV ShR87 320 Mn (0.8), Fe (0.7)
7 3IHK SmR83 218 Ca (0.5), Fe (0.1) Ti, Co, Cu
8 3KB4 NsR141 225 Mn (0.2), Fe (0.4), Ni (0.4) Co
9 3E48 ZR319 289 Ca, Fe, Cu
More than half of the proteins analysed were inconsistent with their entry in the PDB! This highlights a deep problem in identifying metal constituents of proteins.
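The two PIXE columns in these tables sort each element by how far its signal lies above the limit of detection (LOD). A minimal sketch of that sorting rule is given below. The function name and the returned labels are illustrative, and the treatment of values exactly at 1× or 3× LOD is an assumption.

```python
def classify_signal(concentration, lod):
    """Sort a measured concentration into the table's reporting categories."""
    if lod <= 0:
        raise ValueError("limit of detection must be positive")
    if concentration > 3 * lod:
        return "metal"        # reported with a stoichiometry (>3 x LOD)
    if concentration >= lod:
        return "potential"    # listed as a potential metal (1-3 x LOD)
    return "not detected"     # below the limit of detection


for conc in (50.0, 20.0, 5.0):
    print(conc, classify_signal(conc, lod=10.0))
```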
32 Conclusion
High throughput PIXE/RBS analysis of proteins (and other biological macromolecules) offers precise determination of the identity and quantity of metal atoms, with a protocol compatible with standard biochemical methods and reduced cost per sample.
Further development is planned to increase the throughput further (detector solid angle, higher count rates).
30% of all known proteins are metalloproteins, and knowing the correct metal reliably will be a major step in understanding human health and disease and in drug discovery.
33 Thanks to …
Oliver Zeldin, who developed many of the high throughput protocols as part of his Oxford D.Phil work.
Vladimir Palitsin of Surrey IBC for his help in developing the sample holders for printed arrays.
Eddie Snell of the Hauptman-Woodward Medical Research Institute, Buffalo, NY for selecting and providing the protein samples for the demonstration experiment.