Cellulase protein engineering towards improved thermostability

Contreras Leiva, Francisca; Schwaneberg, Ulrich (Thesis advisor); Elling, Lothar (Thesis advisor)

Aachen : RWTH Aachen University (2021, 2022)
Dissertation / PhD Thesis

Dissertation, RWTH Aachen University, 2021


Lignocellulosic biomass is a promising, abundant, and inexpensive raw material that can be used in several industries, such as pulp and paper, textiles, food, feed, and biofuels. It is a complex polymer composed of intertwined cellulose, hemicellulose, and lignin, with cellulose being the major component. Cellulose is a crystalline, unbranched polymer of glucose monomers; its fibers interact with each other to form a highly recalcitrant material whose degradation is demanding (e.g., high temperatures and pressures are required). Because of this complexity, cellulose biodegradation relies on the joint action of different cellulases: cellobiohydrolases, endo-β-1,4-glucanases, and β-glucosidases. The implementation of cellulases as industrial biocatalysts depends on their ability to withstand the harsh conditions employed for lignocellulosic biomass degradation. Thermostable cellulases enable more cost-effective and sustainable processes, as they can help reduce the polluting chemicals currently used for cellulose degradation and thus represent an alternative green catalyst in, for example, biofuel production. Therefore, understanding cellulase thermostability is of high importance for applications in lignocellulosic biomass degradation. In recent decades, protein engineering has become a powerful technology that contributes to the understanding of protein properties such as thermostability. The main objective of this doctoral thesis was to expand the knowledge of the structural elements that determine thermostability in endoglucanases from glycosyl hydrolase family 5 and of how these elements can be engineered for more efficient lignocellulosic biomass degradation. Accordingly, endo-β-1,4-glucanase Cel5A from Penicillium verruculosum was engineered towards improved thermostability.
In this study, a robust high-throughput system was established and validated for the selection of thermostable endoglucanases. Subsequently, a KnowVolution protein engineering campaign was performed, and the C-terminus was identified as a critical structural determinant that improves Cel5A thermostability without compromising activity. The molecular reasons for the C-terminus-driven thermostability of Cel5A were studied, and the influence of the C-terminus on Cel5A thermostability and activity was further examined by focused mutagenesis. Finally, the efficiency of Constraint Network Analysis as an in silico tool for identifying positions at which substitutions improve Cel5A thermostability was analyzed. As key findings, the C-terminus (8th α-helix, comprising amino acid residues 280-314) was identified as a significant structural determinant that harbors several substitutions improving Cel5A thermostability. Variant Cel5A-R17 carries three substitutions: F16L, Y293F, and Q289G. Compared to Cel5A wild type, variant R17 shows an increased melting temperature (by 7.7 °C), a longer half-life at 75 °C (5.5-fold), and a higher T50 (by 5.1 °C). Focused mutations in the C-terminus of the endoglucanase Cel5A confirmed the influence of this protein section on thermostability. A high mutational load on the C-terminus does not lead to protein inactivation, and a new variant that improves Cel5A thermostability without hindering activity was identified. Variant CE1 (E304V, L307M) shows an improvement of 2.1 °C in T50 and a 3.1-fold improvement in half-life at 75 °C. To accelerate enzyme engineering campaigns and reduce screening efforts, Constraint Network Analysis was used as a computational tool to identify positions that can improve the thermal stability of proteins. Using Constraint Network Analysis in the identification phase of a KnowVolution campaign reduced the experimental burden by 40% compared with random mutagenesis.
The best variant, CN5, carrying substitutions T312R, T77G, and S308P, showed a Tm increased by 5.0 °C compared to Cel5A wild type. In conclusion, molecular knowledge of the factors that govern thermostability in endoglucanase Cel5A from glycosyl hydrolase family 5 was obtained. Computational analyses revealed that stabilization of the C-terminal region of Cel5A is responsible for the improved thermostability, and this knowledge can likely be transferred to other hydrolases from the same family. Furthermore, more efficient protein engineering towards improved thermostability can be accomplished by incorporating computational tools into protein engineering campaigns. The efficient design of thermostable cellulases promotes more sustainable biomass degradation by decreasing the consumption of both energy and polluting chemicals and by diminishing greenhouse gas production, which contributes to progress towards a sustainable economy.