A universal segment approach for the prediction of the activity coefficient.
This study investigated the prediction, measurement and modelling of solid-liquid equilibrium for active pharmaceutical ingredients and solvents employed in the pharmaceutical industry. Available experimental data, new experimental data and novel measuring techniques, as well as revisions of existing predictive thermodynamic activity coefficient models, were investigated. Thereafter, and more centrally, a novel model for the prediction of activity coefficients at solid-liquid equilibrium, which incorporates global optimization strategies in its training, is presented. The model draws on the segment interaction approach (via segment surface area) used in solid-liquid equilibrium modelling for molecules, and extends this concept to interactions between functional groups. Ultimately, a group-interaction predictive method is proposed that is based on the popular UNIFAC-type method (Fredenslund et al., 1975). The model is termed the Universal Segment Activity Coefficient (UNISAC) model. A detailed literature review was conducted on the application of the popular predictive models to solid-liquid phase equilibrium (SLE) problems involving structurally complex solutes, using experimental data available in the literature (Moodley et al., 2016 (a)). This was undertaken to identify any practical and theoretical limitations of the available models. Activity coefficient predictions by the NRTL-SAC (Chen and Song, 2004; Chen and Crafts, 2006), UNIFAC (Fredenslund et al., 1975), modified UNIFAC (Dortmund) (Weidlich and Gmehling, 1987), COSMO-RS(OL) (Grensemann and Gmehling, 2005) and COSMO-SAC (Lin and Sandler, 2002) models were carried out, based on available group constants and sigma profiles, in order to evaluate the predictive capabilities of these models. The quality of the models was assessed based on the percentage deviation between experimental data and model predictions.
The NRTL-SAC model was found to provide the best replication of solubility rank for the cases tested. It was, however, not as widely applicable as the majority of the other models tested, due to the lack of available model parameters in the literature. These results are consistent with the comprehensive comparison conducted by Diedrichs and Gmehling (2011). After identifying the limitations of the existing predictive methods, the UNISAC model is proposed (Moodley et al., 2015 (b)). The predictive model was initially applied to solid-liquid systems containing a set of 18 structurally diverse, complex pharmaceuticals in a variety of solvents, and compared to popular qualitative solubility prediction methods, such as NRTL-SAC and the UNIFAC-based methods. Furthermore, the Akaike Information Criterion (AIC) (Akaike, 1974) and the Focused Information Criterion (FIC) (Claeskens and Hjort, 2003) were used to establish the relative quality of the solubility predictions. The AIC scores recommend the UNISAC model in over 90% of the test cases, while the FIC scores recommend UNISAC in over 75% of the test cases. The sensitivity of the UNISAC model parameters was highlighted during the initial testing phase, which indicated the need for a more rigorous method of determining the model parameters, namely optimization to the global minimum. The Krill Herd algorithm (Gandomi and Alavi, 2012) was selected as the optimization technique for this purpose. To verify the suitability of this choice, the algorithm was applied to phase stability (PS) calculations and to phase equilibrium calculations in non-reactive (PE) and reactive (rPE) systems, where global minimization of the total Gibbs energy is necessary. The results were compared to other methods from the literature (Moodley et al., 2015 (c)). The Krill Herd algorithm was found to reliably determine the desired global optima in PS, PE and rPE problems.
The algorithm outperformed or matched all other methods considered for comparison, including swarm intelligence and genetic algorithms, with an average success rate of 89.5% and an average of 1406 function evaluations. The UNISAC model was then reviewed and extended to incorporate the significantly more detailed group fragmentation scheme of Moller et al. (2008), to improve the range of application of the model. New UNISAC segment group area parameters were obtained by data fitting, using the Krill Herd algorithm as the optimization tool. This Extended UNISAC model was then used to predict SLE compositions, or temperatures, for a large volume of experimental binary and ternary system data available in the literature (over 4000 data points), and was compared to predictions by the UNIFAC-based and COSMO-based models (Moodley et al., 2016 (d)). The AIC scores suggest that the Extended UNISAC model is superior to the original UNIFAC, modified UNIFAC (Dortmund) (2013), COSMO-RS(OL) and COSMO-SAC models, with relative AIC scores of 1.95, 4.17, 2.17 and 2.09, respectively. In terms of percentage deviations alone between experimental and predicted values, the modified UNIFAC (Dortmund) and original UNIFAC models proved superior at 21.03% and 29.03%, respectively; however, the Extended UNISAC model was a close third at 32.99%. As a conservative measure to ensure that inter-correlation with the training set did not occur, previously unmeasured data were desired as a test set, to verify the ability of the Extended UNISAC model to estimate data outside of the training set. To accomplish this, SLE measurements were conducted for systems containing diosgenin, estriol, prednisolone, hydrocortisone, betulin and estrone.
These measurements were undertaken in over 10 diverse organic solvents, and in water, at atmospheric pressure and within the temperature range 293.2-328.2 K, by employing combined digital thermal analysis and thermal gravimetric analysis to determine compositions at saturation (Moodley et al., 2016 (e), (f), (g)). These previously unmeasured test set data were compared to predictions by the Extended UNISAC, UNIFAC-based and COSMO-based methods. It was found that the Extended UNISAC model can qualitatively predict the solubility in the systems measured (where applicable), comparably to the other popular methods tested. Its advantage is that the number of model parameters required to describe mixture activities is far lower than for the group contribution and COSMO-based methods. Future developments of the Extended UNISAC model were then considered, which included the preliminary testing of alternative combinatorial expressions, to better account for size-shape effects on the activity coefficient. The limitations of the Extended UNISAC model are also discussed.
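As an illustration of how an activity coefficient model enters an SLE solubility prediction, the following minimal sketch solves the commonly used simplified equilibrium relation ln(x γ) = -(ΔH_fus / R)(1/T - 1/T_m), which neglects heat-capacity and solid-solid transition terms. It is not the UNISAC model: the one-parameter Margules expression stands in for any activity coefficient model, and the function names, the solute properties (T_m, ΔH_fus) and the Margules constant are hypothetical values chosen only for the example. The averaged percentage deviation shown is likewise one assumed definition of the comparison metric.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def ideal_solubility(T, T_m, dH_fus):
    """Ideal mole-fraction solubility (gamma = 1) from the simplified SLE relation."""
    return math.exp(-dH_fus / R * (1.0 / T - 1.0 / T_m))


def margules_gamma(x, T, A=1.5):
    """Placeholder activity coefficient: one-parameter Margules, ln(gamma) = A (1 - x)^2.
    A is a hypothetical constant; a real study would call UNIFAC, NRTL-SAC, etc."""
    return math.exp(A * (1.0 - x) ** 2)


def solve_solubility(T, T_m, dH_fus, gamma, tol=1e-10, max_iter=500):
    """Solve x * gamma(x, T) = x_ideal by damped fixed-point iteration."""
    x_ideal = ideal_solubility(T, T_m, dH_fus)
    x = x_ideal
    for _ in range(max_iter):
        x_new = min(x_ideal / gamma(x, T), 1.0)
        if abs(x_new - x) < tol:
            return x_new
        x = 0.5 * (x + x_new)  # damping improves robustness for strongly non-ideal cases
    raise RuntimeError("solubility iteration did not converge")


def mean_percent_deviation(x_exp, x_pred):
    """Average of 100 * |x_exp - x_pred| / x_exp over all data points (assumed definition)."""
    terms = [100.0 * abs(e - p) / e for e, p in zip(x_exp, x_pred)]
    return sum(terms) / len(terms)


# Hypothetical solute: T_m = 500 K, dH_fus = 30 kJ/mol, evaluated at 298.15 K.
x_id = ideal_solubility(298.15, 500.0, 30000.0)
x_real = solve_solubility(298.15, 500.0, 30000.0, margules_gamma)
print(f"ideal x = {x_id:.5f}, non-ideal x = {x_real:.5f}")
```

Because the activity coefficient itself depends on composition, the solubility has to be found iteratively; the quality of the prediction then rests entirely on the activity coefficient model supplied as `gamma`.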