Wednesday, July 8, 2020

Finally a natural histidine esterase?

During each week of protein structure releases at the PDB, there are typically a few of interest to me, sometimes ones of considerable interest because I've wanted to see them for years (often GPCRs, ion channels, etc.). I rarely take the time to comment on them here, as I barely use this blog anymore. Then there are ones I never even thought to expect, but when I see them, they raise such interesting ideas or questions that I take note. At the risk of partly spoiling the story that will be told when this research is finally published, I describe one of those here.

Several weeks ago, a structural genomics consortium released a structure of a member of the ELOVL family of fatty acid elongases, specifically human ELOVL7. These multipass transmembrane proteins catalyze a reaction analogous to that of the ketoacyl synthase (KS) subunit of fatty acid synthase or the corresponding FabF enzyme in bacteria, which are soluble proteins--namely the decarboxylative Claisen condensation of a malonyl thioester with a straight-chain acyl thioester to yield a beta-ketoacyl thioester. The thiol groups are provided by the acyl carrier protein (ACP) for fatty acid synthase/FabF and by coenzyme A for the ELOVL proteins, but otherwise the reaction is identical.

The ACP-dependent, soluble elongases all use an active site cysteine side chain to accept the straight-chain acyl group from the "first" ACP molecule and transfer it to the acetyl enolate formed from the malonyl thioester.

However, the ELOVL proteins lack a conserved cysteine, which has made their mechanism unclear. What was known was that the most highly conserved sequence is an HXXHH motif in one of the transmembrane helices, which was tentatively proposed to bind a metal ion, and that there are also several other highly conserved polar amino acids in the membrane-embedded regions (Hernandez-Buquer and Blacklock 2013). The use of a serine or threonine was proposed, or possibly a direct acyl transfer between two CoA groups simultaneously bound to the protein.
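For readers who want to locate this kind of motif in a sequence of their own, here is a minimal sketch in Python. It only assumes that "X" means any single residue; the function name `find_hxxhh` and the toy sequence are made up for illustration and are not taken from any real ELOVL entry.

```python
import re

# "H, any two residues, H, H". The lookahead lets overlapping hits be reported too.
HXXHH = re.compile(r"(?=(H[A-Z]{2}HH))")

def find_hxxhh(sequence):
    """Return (1-based start position, matched text) for every HXXHH occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HXXHH.finditer(seq)]

# Toy sequence invented for illustration; NOT a real ELOVL sequence.
toy = "MLLAVFSHVYHHATMLLW"
print(find_hxxhh(toy))
```

A hit from a scan like this says nothing about function by itself; it just gives the residue numbers to look at in a structure or an alignment.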

The new structure, which contains a bound product analog, clearly shows that there is only sufficient space for a single CoA group, and there is no metal ion present. Furthermore, the product analog forms two covalent bonds to the protein that suggest the catalytic mechanism. The precise structure of the analog is not provided in the PDB entry, but the coordinates are consistent with it having been a gamma-delta unsaturated, alpha-halo beta-ketoacyl CoA derivative. The first histidine of the HXXHH motif interacts with one of the amide groups of the CoA backbone and is unlikely to be a catalytic residue. The second histidine is hydrogen bonded to the third, and has undergone an apparent Michael addition to the delta position of the analog. Also, a fourth histidine remote in the primary sequence has displaced the presumed halide from the beta position to form a second covalent bond.

The location of that remote histidine (His 181), near the beta carbon and between the two carbonyl groups, suggests that it likely stabilizes the negative charge developing on the acetyl group during the decarboxylation step by hydrogen bond donation, and possibly also acts as the general acid that protonates the thiol leaving group in the acylation step.

The second histidine of the HXXHH motif (His 150), however, is deeper in the pocket (nearer the acyl tail) and, together with the adjacent His 151 and an aspartate side chain on another helix, forms a His-His-Asp triad reminiscent of the Ser-His-Asp triad of the serine proteases. Its position relative to the product analog, its membership in the triad, and its observed reactivity strongly implicate it as the nucleophile that carries the transferring acyl group. This implies a mechanism similar to the following:
1. The long-chain acyl-CoA binds to the enzyme.
2. His-150 attacks the thioester carbonyl and retains the acyl group as an acyl-imidazole. Its nucleophilicity is enhanced by a proton relay, similar to the serine in serine proteases, and possibly by the ability of the developing positive charge to be shared across the two imidazole rings of the "HH" motif, which are relatively coplanar. An "oxyanion hole" formed by a lysine and an asparagine stabilizes the tetrahedral intermediate.
3. The released free CoA unbinds and malonyl-CoA binds.
4. Due to its carboxyl group being in a negatively charged pocket near Asp 130 and a nearby glutamate, and its acetyl fragment being near the His 181 NH proton, the malonyl group decarboxylates.
5. The resulting enolate attacks the His 150-bound acyl group, forming the product, in a step that may be separate from or concerted with the decarboxylation.

The use of an acylhistidine intermediate is, to my knowledge, unprecedented in biology. Imidazoles as acyl transfer catalysts are, however, known from synthetic organic chemistry, for instance as part of the reagent carbonyldiimidazole. Phosphohistidine intermediates are relatively common in enzymes, appearing in several classes of phosphatase, but enzymes that transfer acyl groups have only been shown to use serine, cysteine, or occasionally threonine. However, approximately a year ago, Anthony Green and colleagues published the first successful de novo design of an artificial esterase using a histidine as a nucleophile (Burke et al. 2019). It seems this may turn out to be one of the many cases in chemistry where nature "invented" something first. The ELOVL7 structure offers an opportunity to compare how evolution and human design accomplish the same "trick".

In the case of the Green group's de novo esterase, a key requirement was methylation of the histidine to make the acylation reaction reversible. Methylated histidines are not unheard of in biology (in fact, there's a "famous" example in the abundant protein actin), but it appears that when histidine is used as a nucleophile, evolution has favored an extended network of hydrogen-bonded interactions instead.

The other interesting question is why this method of catalysis involving histidine is only used in this one example out of the many dozens of acyl-transferring enzymes that have been studied. Perhaps an aromatic nucleophile, which can be embedded more easily in a desolvated transmembrane pocket than a small, highly polar side chain, is favored here. All other intramembrane acyl transfer enzymes that I am aware of, such as protein palmitoyltransferases (PDB: 6BMS) and the MBOAT proteins (e.g. PDB: 6BUG), carry out direct attack of a soluble acceptor on a membrane-bound acyl-CoA molecule, without the need for an acyl-enzyme intermediate at all.

As this structure was determined by a structural genomics consortium as opposed to a dedicated lipid biosynthesis research group, I don't know how extensive the accompanying publication will be. However, assuming that the authors agree on the implications of their structure, I am curious to see if they will present any biochemical data regarding the existence of the acylhistidine intermediate. All residues involved in the above mechanism are not only highly conserved; their mutation also has (in most cases strongly) detrimental effects on catalysis in the ELOVL family member where it was tested (Hernandez-Buquer and Blacklock 2013). This does not, however, prove which residue carries the acyl group.

Friday, March 24, 2017

Signaling inside out

G protein-coupled receptors are a very large superfamily of transmembrane proteins that transmit signals into cells. They all have seven membrane-spanning helices, and most of them can activate a so-called G-protein that binds guanine nucleotides. I say “most” because there are some families that are sometimes grouped with GPCRs that seem to signal independently of G-proteins, for instance the Hedgehog receptor Smoothened and the PAQR receptors that are the subject of this post.

The early and mid-2000s saw the first structures of these receptors, at first in the “off” state with antagonists bound, and later in the “on” state with agonists bound. I have posted about some of these structures before. These structures revealed that when an agonist binds to the outside of the receptor, the outer halves of several of the helices in the bundle move slightly closer together, and this triggers, in a kind of “see-saw” manner, their inner halves to splay apart, creating the G-protein binding site. In particular, the outside ends of TM5 (blue-green), TM7 (red-orange), and to a lesser extent TM6 (light orange) move in, and the inner end of TM6 moves dramatically out, with TM5 and TM8 moving less. Here, the inactive "off" state is in rainbow colors, and the active "on" state is in light blue, and this figure is based on two structures of the beta2 adrenoceptor ((1) and (2)).

Saturday, October 24, 2015

A TRP down structure lane--Part 2

In my last post, I mentioned how highly homologous TRP channels respond in opposite ways to temperature changes. A clue to this paradox came later, when the structure of TRPA1 was published by the same lab as the TRPV1 structure(1). The previously discussed beta-bridges, or rather the segment that would form them if folded, is disordered in this structure. This is consistent with an "active", low-temperature state, which is also favored by disruption of the bridges--although, as mentioned previously, it is controversial whether the specific human channel in this structure is cold-sensitive. In any case, given that highly homologous channels ARE unequivocally cold-activated, the necessary architecture should be present.

Lo and behold, the ARDs in the TRPA1 structure are oriented in a globally distinct fashion from those in TRPV1, projecting perpendicularly to the membrane in a shape that I will call a "bullet" conformation--to distinguish it from the "pinwheel"--because of the combined shape of the ARDs plus the transmembrane domains.

Tuesday, October 13, 2015

A TRP Down Structure Lane--Part 1

How do we know when a room is hot or cold? The family of transient receptor potential (TRP) ion channels senses general stimuli such as temperature, reactive chemicals, and pH, which open a cation-conducting pore. In particular, one member called TRPV1 is a very important part of the ability to sense painful heat, TRPM8 senses cool temperature, and TRPA1 senses electrophilic chemicals. TRPA1 also has a role as a painful cold sensor in some species, though the generality of this to humans is controversial(1), and in some species it is in fact a sensor of warmth(2). These channels are also regulated by more conventional ligands; for instance, TRPV1 is activated by capsaicin from chili peppers (explaining why they "taste hot"), while TRPM8 is activated by menthol in mint.

How does a protein sense such general stimuli as temperature and chemical reactivity? In the case of electrophiles and pH, the answer is straightforward--these modify the chemical structure of the protein at nucleophilic and acidic sites, respectively, much as a chemical ligand binds to a receptor. In particular, cysteine and lysine side chains are the nucleophilic sites in TRPA1(3). Temperature is trickier--but the simplest mechanism is that there is a structural element in the protein that is ordered below a given temperature, and disordered above it. Other mechanisms involve physical or chemical changes to the membrane lipids--in particular, certain oxidized lipids were proposed to mediate heat sensation by TRPV1(4), although this is questionable(5).

The structure of TRPV1,

Tuesday, October 6, 2015

Back after a long hiatus

I decided to start this blog up again as a fully-specialized blog about science, leaning strongly toward discussions of structural biology. When I started this over 8 years ago, I was just out of undergrad and wanted a place to discuss all sorts of ideas, not just about science but about the world in general. And since my science-brain was developed but my worldview-brain wasn't (well, in many ways it still isn't, but that's another story), it was very disorganized. I'm sure most people who came here to read about science didn't care about my possibly weird opinions on everything that was going on in the news, society, etc.

Then why did I decide to start blogging again at all? Well, I'm finding that as I read through the many published articles on structural biology, I often have hypotheses and hunches that I could never publish, and that pertain to fields so far apart in terms of the underlying biology that I couldn't possibly study all of them. I felt I needed a place to "air" these ideas, where anyone who wants to read them (maybe even some people who work on the relevant systems) can do so. 

While I sometimes have these ideas about other areas of science, I can't state them nearly as precisely as my hypotheses regarding protein structure. Therefore they're best suited to in-person discussions with me at a bar or something like that. Plus, because they are visual, I find ideas about protein structure more intuitive to describe, too.

An example of why I want a blog like this--it turns out that the metal ion that I hypothesized here to trigger GTP hydrolysis in translation factors has since been found to exist(1) (at least in eIF5B, which is homologous to EF-Tu and EF-G), contrary to the preliminary data I reported on in my last post, and it's virtually exactly where I predicted it. It took an additional four years for it to be found, though, probably because the resolution of the ribosome-bound elongation factor structures was too low to observe it.

As you may notice, I'm also moving to a more proper citation format. This post only has one, but upcoming posts may have many, and I want to make sure I'm properly referencing the publications on which I base my hypotheses.

(1) Kuhle B, Ficner R. A monovalent cation acts as structural and catalytic cofactor in translational GTPases. The EMBO Journal. 33(21):2547-63. 2014.

Tuesday, November 23, 2010

Update on structures

First an update on the last one:
The structure of EF-Tu on the ribosome has now been solved with a non-hydrolyzable guanine nucleotide, in the true activated state. The metal ion that I guessed might be there is in fact not, but the previously disordered region of the effector loop is, in fact, ordered. This will hopefully lay to rest the idea that the elongation factors are activated by DISordering of their active site loops, which was always the biggest eyesore of all the models for their function so far. And the contacts with the ribosome are surprisingly simple, involving mainly just two histidines acting as a "ruler" to measure the orientation of the rRNA phosphate backbone. One is aligned in a phosphate-imidazole-water network vaguely reminiscent of the serine protease catalytic triad, explaining enhanced catalysis.

In addition, the number of G-protein coupled receptor structures has been steadily increasing. A friend asked me when I thought that solving new inactive GPCR structures would become boring, and while I'm sure this point will come sooner than anyone would predict at this time, we still have a ways to go. The family of seven-transmembrane receptors is very diverse, and there will no doubt be peculiarities to some of the more atypical members, like metabotropic glutamate receptors or even weirder ones like Frizzled. Yes, I know the last is not technically a GPCR, though it can be made into one quite easily by merely mutating loops, so I think of it as one. And that's not to mention the 100+ orphan receptors, whose ligand identification might be accelerated if we had structures.

Tuesday, June 15, 2010

A possible mechanism for elongation factors--finally?

Like the last speculation, this one involves the ribosome. In particular, the two elongation factors that in bacteria are known as EF-Tu and EF-G. Like all GTPases, these proteins have a catalytic domain that resembles the Ras family of signaling molecules. Clearly, however, the details of the catalysis are distinct. In common with each other, both types of proteins have low catalytic activity on their own (though the quantitative meaning of "low" varies), and need to be activated by binding to something else in order to achieve rapid catalysis.

In the case of Ras-like proteins, these partners are called GTPase-activating proteins, or GAPs, and they bind to a particular region of the proteins near the phosphates of the bound GTP. Their activity commonly (though not always) involves an arginine side chain that is inserted into the active site and presumably stabilizes the transition state. However, EF-Tu and EF-G are activated by interacting with a particular state of the ribosome. The sarcin-ricin loop, or SRL, of the large subunit occupies a position similar to GAPs in Ras-like GTPases, but in the available structures it makes few contacts to the protein. This is probably due to the fact that one part of the protein, the so-called Switch I segment, is disordered.