YEAST VOL. 12: 385-390 (1996)
Yeast Sequencing Reports
Sequence Analysis of the 43 kb CRM1-YLM9-PET54-DIE2-SMI1-PHO81-YHB4-PFK1 Region from the
Right Arm of Saccharomyces cerevisiae
Chromosome VII
QUIRINA J. M. VAN DER AART†, KARL KLEINE‡ AND H. YDE STEENSMA*†§
†Institute of Molecular Plant Sciences, Leiden University, Wassenaarseweg 64, 2333 AL Leiden, The Netherlands
‡Martinsrieder Institut für Protein Sequenzen, Am Klopferspitz 18a, D-82152 Martinsried, Germany
§Delft University of Technology, Department of Microbiology and Enzymology, Julianalaan 67, 2628 BC Delft,
The Netherlands
Received 6 July 1995; accepted 23 September 1995
The nucleotide sequence of a 43 118 bp fragment from chromosome VII of Saccharomyces cerevisiae has been
determined and analysed. The fragment originates from the right arm of chromosome VII. It starts approximately
11 kb centromere-proximal to the pet54 marker and ends in the middle of the PFK1 gene. The sequence contains a
small nuclear RNA gene (SNR7) and 29 open reading frames (ORFs) larger than 100 amino acids. Six of these were
completely internal to or partially overlapped other ORFs. Six previously described genes, YLM9/MRPL9, CRM1,
DIE2, SMI1, PHO81 and YHB4, were mapped to this region in addition to pet54 and PFK1. Of the remaining 17
ORFs, four showed homology with other S. cerevisiae genes and four, including one of the partially overlapping
ORFs, with genes from other organisms. Eight ORFs had no homology with any sequence in the databases. The
actual sequences have been deposited in the EMBL database under Accession Number X87941.
KEY WORDS
Saccharomyces cerevisiae; chromosome VII; sequence; snRNA; SNR7
*Corresponding author
CCC 0749-503X/96/040385-06
© 1996 by John Wiley & Sons Ltd
INTRODUCTION
We have sequenced a 43 kb fragment from Saccharomyces cerevisiae as part of the European project to sequence the entire 1150 kb chromosome VII DNA molecule. The segment originated from the right arm of chromosome VII of strain S288C and formed the yeast DNA insert of cosmid pEGH484, provided by H. Tettelin (Université Catholique de Louvain). The inserted DNA extends from 11 kb centromere-proximal to the pet54 marker into the middle of the PFK1 gene. In this report we present the sequence and the computer analysis of the entire 43 118 bp fragment.
MATERIALS AND METHODS
Strains and plasmids
Cosmid pEGH484 containing a 43 kb yeast DNA insert was received from H. Tettelin, the DNA coordinator for chromosome VII. It is a partial Sau3A fragment from chromosome VII of strain S288C inserted in the unique BamHI site of pWE15 (Evans and Wahl, 1987). Phagemid pBluescript II KS+ (Stratagene) was used for sub-cloning and sequencing. Escherichia coli strain XL1-Blue (recA1 endA1 gyrA96 thi-1 hsdR17 supE44 relA1 lac [F' proAB lacIqZΔM15 Tn10 (Tetr)]; Bullock et al., 1987) was used for plasmid amplification.
DNA manipulations
Plasmid preparations were carried out using the ammonium acetate method of Lee and Suraiya (1990). DNA for automatic sequencing was purified over Nucleobond-AX (Macherey-Nagel, Düren) columns according to the manufacturer's instructions. Restriction endonucleases and T4 DNA ligase were used according to the recommendations of the suppliers (Boehringer, Pharmacia).
Sequencing strategy
DNA sequencing was carried out by combining primer walking and direct cloning sequencing approaches. First, fragments varying from 2 to 10 kb were sub-cloned and amplified. The sequence of the yeast DNA in the sub-clones was determined by primer walking. Junctions between fragments were either sequenced from overlapping sub-clones or directly from cosmid DNA which was digested by appropriate enzymes. Sequence reactions on dsDNA as template were primed with the M13 universal and reverse primers or with synthetic oligonucleotides (Pharmacia Nederland, Roosendaal). We used the dideoxy method of Sanger et al. (1977) with FITC-labelled dATP and T7 DNA polymerase (Pharmacia) on an automatic sequencer (ALF, Pharmacia). The sequences were determined on both strands and each base was sequenced at least three times.
Sequence analysis
The Heidelberg Geneskipper program and the GCG sequence analysis software package (Devereux et al., 1984) were used for sequence alignments and analysis. Comparison of nucleotide and amino acid sequences to the data banks EMBL release 35 and PIR International release 44.07 was performed with either FASTA or the on-line MIPS package (Martinsried Institute).
RESULTS AND DISCUSSION
The complete sequence of the DNA insert in pEGH484 was determined in both directions. First a restriction map was constructed for the restriction endonucleases BamHI, BglII, EcoRI, SalI, XbaI and XhoI (Figure 1). Using these sites, sub-clones were made. Some of the larger sub-clones were further sub-cloned as HindIII or BglII fragments. The inserts of the sub-clones were sequenced in both directions and each base was independently determined at least three times, with a mean of 4.06 times per base.
Figure 1. Restriction map and ORF positions. The restriction map of the entire 43 kb chromosome VII insert in cosmid pEGH484 is shown on top. For better representation the continuous map is divided into three parts, at a SalI and an XbaI site respectively. B, BamHI; Bg, BglII; R, EcoRI; S, SalI; Xa, XbaI; X, XhoI. The arrows below the restriction map show the positions of the ORFs. Numbers refer to the last two digits of the provisional names, e.g. 01=G8501. 'A' represents the positions of ARS consensus sequences, 'R' that of the snRNA.
Sequence analysis revealed 29 open reading frames (ORFs) of more than 100 amino acids, a small nuclear RNA gene and two ARS consensus sequences. The ORFs were provisionally named G85 followed by an arbitrary number. Final names will be assigned when the sequence of the entire chromosome VII DNA molecule has been obtained. The characteristics of the ORFs, listed in Table 1, are discussed below.
Four ORFs, G8539, G8555, G8583 and G8591, were completely internal to other ORFs and all four run in the opposite direction with respect to the larger ORFs. ORF G8517 partially overlapped G8520 (YLM9/MRPL9, opposite orientation) and G8550 partially overlapped both G8555 (same orientation) and the SMI1 transcribed region, G8553 (opposite orientation). It is unlikely that these six ORFs represent real genes, although G8550 shows homology (starting at amino acid 77) with 24 amino acids from an archaeal lipoprotein attachment site (Mattar et al., 1994).
The sequences of eight ORFs, CRM1 (Toda et al., 1992; D13039), YLM9/MRPL9 (Graack et al., 1992; X65014/S37340), PET54 (Costano et al., 1989; X13427), DIE2 (Nikawa and Hosaka, 1995; D38049), SMI1 (Fishel et al., 1993; L15423), PHO81 (Coche et al., 1990; S41074), YHB4 (Zhu and Riggs, 1992; B45383) and PFK1 (S38963) have been reported previously, although the positions of the corresponding genes, except PET54 and PFK1, on the physical and genetic maps of chromosome VII are unknown. The published sequence of the YLM9/MRPL9 ORF (Graack et al., 1992; X65014/S37340) showed three different bases in 807 bp compared with our data, but no differences in the amino acid sequences. The sequences of DIE2 and G8547 differ in one base, whereas the amino acid sequences are identical. The published sequence of PHO81 (Coche et al., 1990; S41074) is lacking one amino acid, an asparagine at position 974, compared to our sequence. No further differences were found between the ORFs and the published sequence data. Assuming that these differences are all caused by sequence errors, which is highly unlikely, the error rate for the ORFs would be seven differences in 13 341 bases or 0.05%.
Four ORFs show homology with other S. cerevisiae genes. These include a drug-resistance gene, SGE1 (Amakasu et al., 1993; S46275; 53% identity and 75% similarity in 526 amino acids with ORF G8537). This ORF (G8537) also shows significant homology with an ORF on chromosome XI (44% identity and 67% similarity in 316 amino acids with YKR105C, S38184). Homology was further found with a cell cycle gene, CDC20, which is involved in cell division control (S48507; 48% identity and 68% similarity in 153 amino acids with ORF G8541) and a sporulation gene, SPO12 (Malavasic and Elder, 1990; S46756; 31% identity and 59% similarity in 115 amino acids with ORF G8558). Finally, ORF G8561 is 59% identical and 79% similar in 217 amino acids with the yeast homolog of prohibitin (S50315), which determines the replicative life span.
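The error-rate estimate quoted above for the previously published ORFs (seven differences in 13 341 bases) can be reproduced with a few lines of arithmetic. The sketch below assumes that the asparagine codon missing from the published PHO81 sequence counts as three differing bases; this per-gene breakdown is a reading of the text, not a figure stated explicitly by the report.

```python
# Base differences between the published sequences and this fragment.
# Assumption: the missing asparagine codon in PHO81 counts as 3 bases.
differences = {"YLM9/MRPL9": 3, "DIE2": 1, "PHO81": 3}
compared_bases = 13341  # combined length of the eight previously reported ORFs

total_differences = sum(differences.values())
error_rate_percent = 100 * total_differences / compared_bases

print(f"{total_differences} differences in {compared_bases} bases "
      f"= {error_rate_percent:.2f}%")
```

Because the estimate treats every difference as a sequencing error, it is an upper bound in the sense used in the text.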
Four ORFs have homology with sequences from other organisms. ORF G8501, which is only partially present in the sequenced fragment, shows similarity with ion channels from higher eukaryotes (Soldatov, 1992; Salkoff et al., 1987). As mentioned above, ORF G8550, which partially overlaps both G8555 and SMI1 (G8553), has some amino acid sequence similarity with an archaeal lipoprotein attachment site (Mattar et al., 1994). ORF G8564 exhibits similarity with ankyrin from human (25% identity and 56% similarity in 217 amino acids), mouse and Drosophila. Ankyrin is a transmembrane protein involved in differentiation (Milner and Campbell, 1993). Finally, ORF G8578 has the ATP/GTP binding site motif A (Linder et al., 1989).
The remaining eight ORFs do not show significant homology with any sequence in the databases.
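The ORF survey discussed in this section (reading frames of more than 100 amino acids on either strand) can be sketched as a six-frame scan. This is a minimal illustration, not the procedure of the Geneskipper or GCG programs used in the report: the function name `find_orfs`, the ATG-to-stop definition of an ORF and the coordinate convention (1-based, low to high, stop codon excluded) are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=100):
    """Scan all six frames for ATG...stop ORFs longer than min_aa codons.

    Returns (strand, start, end, n_codons) tuples; start/end are 1-based
    positions on the input sequence with start < end, stop codon excluded.
    'W' is the strand as given, 'C' its reverse complement.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("W", seq), ("C", reverse_complement(seq))):
        for frame in range(3):
            atg = None  # index of the first ATG since the last stop codon
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if atg is None and codon == "ATG":
                    atg = i
                elif atg is not None and codon in STOP_CODONS:
                    n_codons = (i - atg) // 3
                    if n_codons > min_aa:
                        if strand == "W":
                            orfs.append((strand, atg + 1, i, n_codons))
                        else:
                            # map reverse-strand indices back to the input
                            orfs.append((strand, n - i + 1, n - atg, n_codons))
                    atg = None
    return orfs


# Toy sequence: one 121-codon ORF on the given strand.
demo = "ATG" + "GCT" * 120 + "TAA"
print(find_orfs(demo))
print(find_orfs(reverse_complement(demo)))
```

ORFs that run off the end of the fragment, such as the partial G8501 and PFK1 reading frames, are not reported by this sketch because it requires both an ATG and a stop codon inside the sequence.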
Table 1. Characteristics of open reading frames.

Name    Start    End      Strand  Amino acids  Fop   pI     Homology                                     Particularities
G8501   1        2570     W       >857         0.51  6.60   Ca2+-channel human; Na+-channel Drosophila   Soldatov (1992); Salkoff et al. (1987)
G8514   4299     7550     W       1084         0.54  5.28   Identical to CRM1                            Toda et al. (1992)
G8517   7793     8131     W       113          0.48  7.87   -                                            Overlapping G8520
G8520   7835     8641     C       269          0.41  10.99  Identical to YLM9/MRPL9                      Graack et al. (1992)
G8523   8885     10 750   C       622          0.38  7.73   -                                            -
G8527   11 681   12 559   W       293          0.40  10.01  Identical to PET54                           Costano et al. (1989)
G8530   12 630   13 973   C       448          0.48  9.76   -                                            -
G8537   14 564   16 402   W       613          0.42  9.58   SGE1 and YKR105C of S. cerevisiae            Amakasu et al. (1993) and PIR: S38184
G8539   14 611   15 039   C       143          -     6.43   -                                            Internal to G8537
G8541   16 903   18 129   W       409          0.44  8.74   CDC20 of S. cerevisiae                       PIR: S48507
G8544   18 162   18 758   C       199          0.42  5.62   -                                            -
G8547   19 177   20 751   W       525          0.45  10.23  Identical to DIE2                            Nikawa and Hosaka (1995)
G8550   21 122   21 463   W       114          0.45  6.90   Archaeal lipoprotein attachment site         Mattar et al. (1994); overlap with G8553 and G8555
G8553   21 142   22 656   C       505          0.41  4.42   Identical to SMI1                            Fishel et al. (1993)
G8555   21 277   21 621   W       115          -     9.19   -                                            Internal to G8553
G8558   23 651   24 061   W       137          0.33  11.22  SPO12 of S. cerevisiae                       Malavasic and Elder (1990)
G8561   24 294   25 238   C       315          0.50  10.62  Prohibitin of S. cerevisiae                  PIR: S50315
G8564   25 718   26 401   W       228          0.46  6.42   Mammalian/Drosophila ankyrin                 Milner and Campbell (1993)
G8567   26 435   29 968   C       1178         0.41  5.67   Identical to PHO81                           Coche et al. (1990)
G8572   31 665   32 859   W       399          0.59  6.25   Identical to YHB4                            Zhu and Riggs (1992)
G8575   33 122   33 820   C       233          0.47  10.06  -                                            -
G8578   34 191   34 577   C       129          0.40  6.67   ATP/GTP binding site, motif A                Linder et al. (1989)
G8581   35 062   37 416   C       785          0.43  9.85   -                                            -
G8583   35 432   35 944   W       171          -     8.45   -                                            Internal to G8581
G8585   37 803   40 448   C       882          0.49  8.18   -                                            -
G8591   39 849   40 175   W       109          -     11.36  -                                            Internal to G8585
G8593   40 951   41 814   C       288          0.44  6.67   -                                            -
G8596   41 995   42 315   C       107          0.36  9.70   -                                            -
G8599   42 536   43 118   C       >194         0.77  11.36  Identical to PFK1                            PIR: S38963
ARS     1145     1155     W       -            -     -      -                                            TTTTATGTTTT
ARS     24 198   24 208   W       -            -     -      -                                            ATTTATGTTTT
Tau     2711     2781     C       -            -     -      -                                            -
Delta   2929     3298     C       -            -     -      -                                            -
snRNA   11 217   11 430   C       -            -     -      Identical to snRNA snR7                      Patterson and Guthrie (1987)

Fop, frequency of optimal codons.
None of the ORFs in this fragment of the yeast genome contains sequences indicative of introns, or contains an 'RPG' box or a Hap2/Hap3/Hap4 binding site in its 5' upstream region, and only one ORF (G8596) has a GCN4 box in the promoter region (Fondrat and Kalogeropoulos, 1994).
Two ARS consensus elements were found (positions 1145-1155 and 24198-24208). It is unlikely that these are functional yeast replication origins, since the first is in the middle of an ORF and potential B-elements are not present in either (Palzkill et al., 1986). In addition, a (solo) tau element (position 2711-2781) and a (solo) delta element (position 2929-3298) are present. The small nuclear RNA gene found at position 11217-11430 is identical to the previously described snR7 (U5-like snRNA; Patterson and Guthrie, 1987).
Between several ORFs we found long stretches of about 18 A and/or T residues. According to Struhl (1985), approximately 25% of all yeast genes contain poly(dA-dT) tracts of comparable size in their upstream regions. They can act as upstream promoter elements for constitutively transcribed genes and also as barriers between transcription units. In the fragment, such long stretches of A and T are found upstream of G8541, G8550, SMI1 (G8553), G8558, YHB4 (G8572) and G8581, thus upstream of six out of 29 ORFs, or six out of 23 when the internal or overlapping ORFs are not counted. In addition, in 80% of the DNA sequences of yeast genes 8 bp dA-dT stretches are found, usually several times per sequence and mostly located in non-coding regions (Struhl, 1985). These short dA-dT stretches can also be found in the fragment we have sequenced.
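A search for the poly(dA-dT) stretches described above can be sketched with a regular expression. The helper name `find_at_tracts` and the exact threshold of 18 residues are assumptions for illustration (the text says "about 18"); the report does not state how the stretches were located.

```python
import re


def find_at_tracts(seq, min_len=18):
    """Return (start, end, tract) for every run of A/T at least min_len long.

    Positions are 1-based and inclusive. min_len=18 mirrors the
    "about 18 A and/or T residues" mentioned in the text (assumed cutoff).
    """
    pattern = re.compile(r"[AT]{%d,}" % min_len)
    return [(m.start() + 1, m.end(), m.group())
            for m in pattern.finditer(seq.upper())]


# Toy sequence: a 19-residue A/T run flanked by GC-rich DNA, then a short
# A/T run that falls below the cutoff.
demo = "GC" * 5 + "A" * 10 + "T" * 9 + "GCGC" + "ATATAT"
print(find_at_tracts(demo))
```

Calling `find_at_tracts(seq, min_len=8)` would pick up the shorter 8 bp dA-dT stretches that Struhl (1985) reports in most yeast gene sequences.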
The frequency of optimal codons (Fop; Sharpe and Cowe, 1991) was calculated for the various ORFs. As may be seen from Table 1, most ORFs have a low or intermediate Fop and therefore would be moderately or poorly expressed. Exceptions are the PFK1 (Fop 0.71) and YHB4 (Fop 0.59) genes, which might be highly expressed.
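As a rough illustration of the Fop measure, the sketch below counts the fraction of codons in a coding sequence that belong to a set of "optimal" codons. It is a simplification under stated assumptions: the three-codon set `EXAMPLE_OPTIMAL` is a placeholder, not the optimal-codon table of the cited reference, and Met, Trp and stop codons are simply left out of the count.

```python
# Placeholder set for illustration only; a real calculation needs the full
# optimal-codon table for S. cerevisiae from the literature.
EXAMPLE_OPTIMAL = {"GCT", "TTG", "AAG"}

# Codons without a synonymous choice (Met, Trp) and stop codons are ignored.
IGNORED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def fop(cds, optimal_codons=EXAMPLE_OPTIMAL):
    """Frequency of optimal codons in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counted = [c for c in codons if c not in IGNORED]
    if not counted:
        raise ValueError("no codons left to count")
    return sum(c in optimal_codons for c in counted) / len(counted)


# ATG GCT GCC TTG AAG AAA TAA: 3 of the 5 counted codons are in the set.
print(fop("ATGGCTGCCTTGAAGAAATAA"))
```

Values near 1 would indicate a strong preference for the chosen codon set, which is the basis for the expression-level inference drawn in the text.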
Of the 23 complete, non-internal ORFs, 15 of the putative proteins have a predicted pI above 8.0 and only four a pI below 6.0. Hence, for this part of the yeast genome there would be many more basic than acidic proteins, which is not in accordance with the rest of the genome.
In summary, the 43 118 bp of the fragment sequenced contain 27 complete ORFs, part of another ORF and part of the PFK1 gene. The mean length of the 27 ORFs is 1156 bp; excluding the six internal or partly overlapping ORFs leaves 21 ORFs with a mean length of 1393 bp. The gene density in this fragment is approximately one gene per 1.9 kb and 75% of the DNA is potentially coding. These figures correspond well to previous data for the yeast genome (Oliver et al., 1992; Dujon et al., 1994; Feldmann et al., 1994; Johnston et al., 1994; Bussey et al., 1995). The insert in pEGH484 thus represents a typical piece of yeast DNA.
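The summary figures can be approximated with simple arithmetic. The sketch below assumes that the gene density counts 23 genes (the 21 complete non-overlapping ORFs plus the two partial ones); that denominator is inferred from the text rather than stated. The `coding_fraction` helper and its toy intervals are hypothetical and only show how overlapping ORFs can be merged before the coding share is computed.

```python
FRAGMENT_BP = 43118
GENES = 23  # assumption: 21 complete non-overlapping ORFs + 2 partial ORFs

density_kb = FRAGMENT_BP / GENES / 1000
print(f"one gene per {density_kb:.1f} kb")


def coding_fraction(intervals, fragment_length):
    """Fraction of a fragment covered by (start, end) intervals.

    Coordinates are 1-based and inclusive; overlapping intervals are
    merged so that shared bases are counted once.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / fragment_length


# Toy example (not data from the report): two overlapping ORFs and one
# separate ORF on a 40 bp fragment.
print(coding_fraction([(1, 10), (5, 20), (31, 40)], 40))
```

Feeding the Start and End columns of Table 1 into `coding_fraction` would give a comparable estimate of the potentially coding share, subject to which ORFs are treated as genes.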
ACKNOWLEDGEMENTS
We thank H. Tettelin, coordinator of chromosome
VII sequencing, for providing the recombinant
plasmid. We are grateful to Linda van der Zanden
for technical assistance. This work was supported
by the Commission of the European Communities
under the BIOTECH program of the Division of
Biotechnology.
REFERENCES
Amakasu, H., Suzuki, Y., Nishizawa, M. and Fukasawa, T. (1993). Isolation and characterization of SGE1: a yeast gene that partially suppresses the gal11 mutation in multiple copies. Genetics 134, 675-683.
Bullock, W. O., Fernandez, J. M. and Short, J. M. (1987). XL1-Blue: A high efficiency plasmid transforming recA Escherichia coli strain with β-galactosidase selection. Biotechniques 5, 376-379.
Bussey, H., Kaback, D. B., Zhong, W. W., et al. (1995). The nucleotide sequence of chromosome I from Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 92, 3809-3813.
Coche, T., Prozzi, D., Legrain, M., Hilger, F. and Vandenhaute, J. (1990). Nucleotide sequence of the PHO81 gene involved in the regulation of the repressible acid phosphatase gene in Saccharomyces cerevisiae. Nucl. Acids Res. 18, 2176.
Costano, M. C., Seaver, E. C. and Fox, T. D. (1989). The PET54 gene of Saccharomyces cerevisiae: Characterization of a nuclear gene encoding a mitochondrial translational activator and subcellular localization of its product. Genetics 122, 297-305.
Devereux, J., Haeberli, P. and Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res. 12, 387-395.
Dujon, B., et al. (1994). The complete DNA sequence of yeast chromosome XI. Nature 369, 371-378.
Evans, G. A. and Wahl, G. M. (1987). Cosmid vectors for genomic walking and rapid restriction mapping. Methods Enzymol. 152, 604-610.
Feldmann, H., et al. (1994). Complete DNA sequence of yeast chromosome II. EMBO J. 13, 5795-5809.
Fishel, B. R., Sperry, A. O. and Garrard, W. T. (1993). Yeast calmodulin and a conserved nuclear protein participate in the in vitro binding of a matrix association region. Proc. Natl. Acad. Sci. USA 90, 5623-5627.
Fondrat, C. and Kalogeropoulos, A. (1994). Approaching the function of new genes by detection of their upstream activation sequences in Saccharomyces cerevisiae: application to chromosome III. Curr. Genet. 25, 396-406.
Graack, H.-R., Grohmann, L., Kitakawa, M., Schafer, K.-L. and Kruft, V. (1992). YmL9, a nucleus-encoded mitochondrial ribosomal protein of yeast, is homologous to L3 ribosomal proteins from all natural kingdoms and photosynthetic organelles. Eur. J. Biochem. 206, 373-380.
Johnston, M., et al. (1994). Complete nucleotide sequence of Saccharomyces cerevisiae chromosome VIII. Science 265, 2077-2082.
Lee, S. and Suraiya, R. (1990). A simple procedure for maximum yield of high-quality plasmid DNA. Biotechniques 9, 676-679.
Linder, P., Lasko, P. F., Ashburner, M., et al. (1989). Birth of the D-E-A-D box. Nature 337, 121-122.
Malavasic, M. J. and Elder, R. J. (1990). Complementary transcripts from two genes necessary for normal meiosis in the yeast Saccharomyces cerevisiae. Mol. Cell. Biol. 10, 2809-2819.
Mattar, S., Scharf, B., Kent, S. B. H., Rodewald, K., Oesterhelt, D. and Engelhard, M. (1994). The primary structure of halocyanin, an archaeal blue copper protein, predicts a lipid anchor for membrane fixation. J. Biol. Chem. 269, 14939-14945.
Milner, C. M. and Campbell, R. D. (1993). The G9a gene in the human major histocompatibility complex encodes a novel protein containing ankyrin-like repeats. Biochem. J. 290, 811-818.
Nikawa, J.-I. and Hosaka, K. (1995). Isolation and characterization of genes that promote the expression of inositol transporter gene ITR1 in Saccharomyces cerevisiae. Molec. Microbiol. 16, 301-308.
Oliver, S. G., et al. (1992). The complete DNA sequence of yeast chromosome III. Nature 357, 38-46.
Palzkill, T. G., Oliver, S. G. and Newlon, C. S. (1986). DNA sequence analysis of ARS elements from chromosome III of Saccharomyces cerevisiae: Identification of new conserved sequence. Nucl. Acids Res. 14, 6247-6264.
Patterson, B. and Guthrie, C. (1987). An essential yeast snRNA with a U5-like domain is required for splicing in vivo. Cell 49, 613-624.
Salkoff, L., Butler, A., Scavarda, N. and Wei, A. (1987). Nucleotide sequence of the putative sodium channel gene from Drosophila: the four homologous domains. Nucl. Acids Res. 15, 8569-8572.
Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Sharpe, P. M. and Cowe, E. (1991). Synonymous codon usage in Saccharomyces cerevisiae. Yeast 7, 657-678.
Soldatov, N. M. (1992). Molecular diversity of L-type Ca2+ channel transcripts in fibroblasts. Proc. Natl. Acad. Sci. USA 89, 4628-4632.
Struhl, K. (1985). Naturally occurring poly(dA-dT) sequences are upstream promoter elements for constitutive transcription in yeast. Proc. Natl. Acad. Sci. USA 82, 8419-8423.
Toda, T., Shimanuki, M., Saka, Y., et al. (1992). Fission yeast pap1-dependent transcription is negatively regulated by an essential nuclear protein, crm1. Mol. Cell. Biol. 12, 5474-5484.
Zhu, H. and Riggs, A. F. (1992). Yeast flavohemoglobin is an ancient protein related to globins and a reductase family. Proc. Natl. Acad. Sci. USA 89, 5015-5019.