1. Introduction
Petroleum is a mineral resource consisting of myriad hydrocarbons and of sulfur-, oxygen-, nitrogen-, and metal-containing organic species with diverse composition, structure, and molecular weight [
1,
2]. Although it is not a renewable resource, and although earlier forecasts expected it to be exhausted by now, the continual discovery of new oil fields keeps moving the expected date of its depletion further into the future [
2]. To use petroleum crude rationally, one needs to know its chemical nature. That is not an easy task, because crude oil is made up of a myriad of organic substances that differ in composition, structure, and molecular weight [
3,
4]. The distribution of the petroleum properties such as density, molecular weight, H/C ratio and others can be described by probability distribution functions [
5,
6,
7]. The processes used to manufacture petroleum-derived products such as fuels, monomers, feeds for organic synthesis, and polymers depend strongly on the characteristics of the petroleum feedstock [
8,
9]. That is why the characterization of the petroleum crudes takes a pivotal place in many studies [
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25]. Many different techniques have been applied to investigate the chemical nature of the crude oils explored around the world. Despite tremendous advances in the sophisticated analytical methods used to characterize the most difficult part of the petroleum crude, the so-called “bottom of the barrel”, and the part of it that is hardest to analyze and understand, the asphaltenes (heptane or pentane insolubles), the mutual relations of the compounds making up petroleum are still not well understood [
26].
Petroleum is not a uniform raw material. Each petroleum crude produced in the world has a unique chemical constitution, which varies with the manner of its formation. Typically, more than 150 crude grades are traded [
27] worldwide, and many of these traded crude oils are obtained by blending petroleum crudes from two or more fields. The properties of petroleum crudes vary very widely around the world, as discussed later in this review. Oil refining is the process of transforming crude oil into marketable products such as fuels, lubricants, and light olefins (feeds for the petrochemical industry). Stringent environmental regulations, strict product specifications, the limited availability of light sweet crude oils, and volatile refining margins reduced the number of operating refineries worldwide from about 750 to 643 over a period of 12 years [
8]. The shift of product demand from fuels to light olefins is another challenge facing oil refining [
28]. The changes occurring in petroleum refining, which result from ever more stringent environmental regulations, tougher product specifications, and the search for outlets for processing waste and biomaterials [
29], together with the natural requirement for any business process to be profitable, demand the adoption of innovative techniques. The purchase of crude oil is the largest cost in oil refining (
Figure 1), and its share of all refining costs has increased from 80% [
30] up to 95% [
31] over the years. Therefore, processing cheaper crude oils, also called “opportunity crudes” (OCs) or “advantaged oils”, could improve refining profitability [
2]. However, OCs are cheaper because of their lower quality, and their processing presents extraordinary technical challenges [
2]. OCs contain higher amounts of compounds that are harmful to refining processes, such as asphaltenes, metal-bearing species, and naphthenic acids, which jeopardize the smooth operation of the refinery [
2]. Their physical properties and reactivity have severe consequences in oil refining [
2]. A lack of comprehensive knowledge of these molecules may lead to an unplanned shutdown for cleaning or repair of process equipment, and hence to a large loss of profit opportunity, when OCs are processed. A better understanding of the features of OCs and of their impact on refining process performance requires investigations that are, unfortunately, expensive. A comprehensive laboratory crude oil assay may cost in excess of 20,000 USD [
32].
Only a small number of major oil companies have the resources to build an in-house library of laboratory assays for crude oils from around the world [
32]. Even with these laboratory crude oil assays in hand, it is difficult to predict the behavior of OCs during processing, because oil refineries typically process not a single crude but a blend of crudes [
33]. For that reason, proper modeling and prediction of the properties of petroleum blends is required [
34,
35]. In this paper we present a comprehensive review of the methods reported in the literature for studying the chemistry and technology of petroleum, and of the challenges researchers face in doing so, with the aim of finding the most efficient way to process the different grades of crude oil available around the world and to make the products that society needs.
2. Petroleum Characterization by Means of Crude Oil Assay
Petroleum characterization by means of a crude oil assay consists of a series of laboratory tests that measure the physical and chemical properties of whole crudes and of several distilled fractions [
11,
36,
37,
38,
39,
40,
41]. Usually, the boiling ranges of the distilled fractions coincide with those of the commercial fuels produced in refineries [
36]. Crude oil assays can be classified as comprehensive (full) assays and short (inspection) assays. Full assays are extensive and particularly important for new crudes. In a crude assay, atmospheric and vacuum distillation tests are combined to generate true boiling point (TBP) distillation data. A full assay is obtained from a series of physical and chemical tests that give a precise picture of the quality of the petroleum and of its products and provide guidance on its behavior during refining, transportation, or storage [
11]. At minimum, the assay should contain a distillation curve, typically a TBP curve, and a specific gravity curve. A typical analysis scheme of petroleum characterization by a comprehensive crude oil assay (full assay) is presented in
Figure 2.
The comprehensive assay is complex, costly, and time-consuming. It is normally performed only when a new field in which a company has an equity interest comes on stream, when a crude that has not previously been processed arrives at a refinery, or when the inspection assay indicates that significant changes in the stream’s composition have occurred [
37]. The typical comprehensive assays of extra light (API > 40), light (30 < API < 40), medium (20 < API < 30), heavy (10 < API < 20) and heavy-extra heavy crude oils (API ≈ 10) from all over the world are presented in
Table S1. It is interesting to note that crude oils of similar density can have very different fraction distributions and distillate properties (see, for example,
Table S1: Cinta from Indonesia and Isthmus from Mexico; Wilmington from USA and Eocene from Kuwait). Petroleum properties depend on maturity and geographic region of origin. Crude oils with a gravity below 20°API (heavy crude oils with SG above 0.934) are immature, and those above 20°API are mature [
42]. In addition, crude oil sulfur content depends on source and maturity. Crude oils of marine origin have sulfur contents below 1%, and those of non-marine origin have sulfur contents above 1% [
43]. Crude oil maturity increases with the decrease of sulfur content [
44]. The typical minimum assays of extra light, light, medium, and heavy crude oils are presented in
Tables S1–S3. The data in
Tables S1–S3 include TBP curve data and the density and sulfur distributions of 37 extra light, light, medium, and heavy crude oils. A comprehensive crude oil assay such as that shown in
Table S1 costs 11,000 Euro, while a short, or so-called inspection, assay [
37] such as that shown in
Tables S1–S3 costs 5600 Euro (2022 cost level). The variation of petroleum bulk properties extracted from a great number of available crude oil assays (more than 4000) and from literature sources is summarized in
Table 1.
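The API bands and the 20°API / SG 0.934 equivalence quoted above can be reproduced with a short sketch. It assumes the standard API gravity definition (API = 141.5/SG − 131.5, with SG at 60 °F); the function names are hypothetical, and the handling of exact band edges, which the text leaves open, is an assumption made only for illustration.

```python
def sg_from_api(api_gravity: float) -> float:
    """Specific gravity at 60 °F from API gravity (standard API definition)."""
    return 141.5 / (api_gravity + 131.5)


def api_from_sg(specific_gravity: float) -> float:
    """API gravity from specific gravity at 60 °F."""
    return 141.5 / specific_gravity - 131.5


def classify_crude(api_gravity: float) -> str:
    """Assign a crude class using the API bands quoted in the text.

    Assumption: a value exactly on a band edge falls into the heavier
    class, and anything at or below 10 °API is treated as
    "heavy-extra heavy" (the text only says API ≈ 10 for that class).
    """
    if api_gravity > 40:
        return "extra light"
    if api_gravity > 30:
        return "light"
    if api_gravity > 20:
        return "medium"
    if api_gravity > 10:
        return "heavy"
    return "heavy-extra heavy"


if __name__ == "__main__":
    # 20 °API corresponds to SG of about 0.934, the maturity limit in the text.
    print(round(sg_from_api(20.0), 3))
    print(classify_crude(api_from_sg(0.85)))
```

The classification is deliberately kept as a plain chain of comparisons so that the band limits can be changed in one place if a different convention is preferred.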
The data in
Table 1 indicate that the bulk properties of petroleum crudes can vary over a very wide range. For example, the minimum crude density of 0.746 g/cm³ [45] corresponds to the density of medium naphtha (fraction 100–150 °C), as evident from the data in
Table S1. By contrast, the maximum density of petroleum crudes, 1.119 g/cm³, is very close to the density of the heaviest and most polar fraction of the crude oil, the asphaltenes [
48]. The minimum molecular weight of petroleum crude, 117 g/mol, lies within the range of the naphtha fraction [
49], while the maximum, 652 g/mol, lies within the range of the vacuum residue fraction [
26,
50]. The range of variation of a property between crude oils can be so wide that in some cases the property (nitrogen, metals, viscosity, density, etc.) cannot be measured with a single standard method, and different approaches are sought to obtain reliable values for it [
51,
52,
53]. The maximum C₇-asphaltene content of petroleum, 43.0% [46], is much higher than that of the heaviest, highest-boiling fraction (the vacuum residue) of the crude oils displayed in
Table S1. The C₇-asphaltenes are concentrated in the crude oil vacuum residue. The distilled fractions usually contain no C₇-asphaltenes, and therefore one may expect the crude oil asphaltene content to equal the product of the vacuum residue yield of the crude and the asphaltene content of the vacuum residue. However, our recent study indicated that asphaltenes, as a solubility class of petroleum components, do not always obey the additive rule [
54]. To test whether the crude oil asphaltene content can be predicted from the vacuum residue asphaltene content and the TBP yield, we analyzed the asphaltene content of both the crude oils and the vacuum residues derived from them. We also measured the asphaltene content of the vacuum gas oil fractions, which are lighter than the vacuum residue, and detected no asphaltenes in them. We found no published reports examining the use of the additive rule to predict crude oil asphaltene content from that of the vacuum residue multiplied by the vacuum residue TBP yield.
Table 2 presents the contents of C₅- and C₇-asphaltenes in 28 crude oils and their vacuum residue fractions and compares the measured asphaltene contents with those estimated using the additive rule (the vacuum residue yield of the crude multiplied by the asphaltene content of the vacuum residue).
Figure 3 plots measured versus estimated C₇- and C₅-asphaltene contents obtained using the additive rule. These data suggest that the crude asphaltene content can be predicted from the TBP vacuum residue yield multiplied by the asphaltene content of the vacuum residue. The data in
Figure 3 also show that the measured asphaltene content is generally higher than the estimated one, probably because the crude oil maltene fraction has a poorer solvent power than the vacuum residue maltene fraction. Another interesting finding from the data in
Table 2, concerning the relation between C₅- and C₇-asphaltenes, is depicted in
Figure 4. These data show that the ratio of C₅- to C₇-asphaltenes is the same in the parent crude oils and in the vacuum residue fractions derived from them. They also show that, if the C₇-asphaltene content is known, the C₅-asphaltene content can be computed from the regression equation shown in
Figure 4. Furthermore, the difference between the C₅- and C₇-asphaltene contents gives the fraction that is soluble in n-heptane but insoluble in n-pentane, that is, the n-pentane insoluble resins. Thus, when the saturate content of the crude oil and its C₇- or C₅-asphaltene content are known, the full SARA (saturates, aromatics, resins, asphaltenes) composition can be obtained.
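The additive rule and the resin-by-difference argument described above can be written out as a minimal sketch. The function names and the numerical inputs are hypothetical. The step that closes the balance by taking aromatics as the remainder to 100 wt% is an assumption made here for illustration; the text does not state how the aromatics are obtained.

```python
def crude_asphaltenes_additive(vr_yield_wt: float, vr_asphaltenes_wt: float) -> float:
    """Additive-rule estimate of crude asphaltene content, wt%.

    vr_yield_wt       -- TBP vacuum residue yield of the crude, wt%
    vr_asphaltenes_wt -- asphaltene content of the vacuum residue, wt%
    """
    return vr_yield_wt * vr_asphaltenes_wt / 100.0


def sara_from_saturates_and_asphaltenes(saturates_wt: float,
                                        c5_asphaltenes_wt: float,
                                        c7_asphaltenes_wt: float) -> dict:
    """Sketch of a SARA balance from saturates and C5/C7 asphaltenes.

    Resins are taken as C5-asphaltenes minus C7-asphaltenes, as in the text.
    Aromatics are obtained by difference to 100 wt% (an assumption).
    """
    resins = c5_asphaltenes_wt - c7_asphaltenes_wt
    aromatics = 100.0 - saturates_wt - c5_asphaltenes_wt
    if resins < 0 or aromatics < 0:
        raise ValueError("inconsistent inputs: negative resin or aromatic content")
    return {
        "saturates": saturates_wt,
        "aromatics": aromatics,
        "resins": resins,
        "asphaltenes": c7_asphaltenes_wt,
    }


if __name__ == "__main__":
    # Hypothetical crude: 25 wt% vacuum residue containing 12 wt% C7-asphaltenes.
    print(crude_asphaltenes_additive(25.0, 12.0))
    print(sara_from_saturates_and_asphaltenes(55.0, 6.0, 3.0))
```

Because the measured crude asphaltene content is reported to be generally higher than the additive estimate, the first function should be read as a lower-leaning approximation rather than an exact balance.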
The true boiling point (TBP) distillation analysis is the heart of the petroleum crude characterization process because it provides information about the quantity of the different fractions, which are treated in the different refinery plants such as isomerization, reforming, hydrotreatment, fluid catalytic cracking, hydrocracking, etc. [
55,
56,
57,
58,
59,
60,
61,
62,
63,
64,
65,
66,
67,
68,
69,
70,
71,
72,
73,
74,
75,
76,
77,
78,
79,
80,
81,
82,
83,
84,
85,
86,
87,
88,
89,
90]. It also gives the quantities of the different petroleum fractions that are to be analyzed further for their characteristics (
Figure 2) [
11,
36]. Typically, the crude oil TBP analysis is performed in accordance with the standards ASTM D 2892 (crude oil distillation under atmospheric pressure) and ASTM D 5236 (atmospheric residue distillation under vacuum). The TBP analysis of a crude oil usually takes three days: two days for the ASTM D 2892 analysis and one day for the ASTM D 5236 analysis. The TBP distillation curves of extra light (CPC), light (Azeri Light), medium (Urals), and heavy (Ras Gharib) crude oils are depicted in
Figure 5.
One can see from the data in
Figure 5 that the points of the ASTM D 2892 and ASTM D 5236 distillations do not lie exactly on the same curve. A shift to higher boiling points is observed at the transition from ASTM D 2892 to ASTM D 5236. This produces a gap in the content of the 360–380 °C fraction, the first fraction of the ASTM D 5236 vacuum distillation, as observed in the data in
Table S2. It is difficult to believe that such a gap can physically exist in a continuous distillation curve. Some researchers have employed different interpolation functions to fit distillation curves of petroleum fluids [
6,
7,
17,
55,
61,
62,
87,
88,
89,
90]. Sanchez et al. [
61] concluded that the Weibull extreme, Weibull, and Kumaraswamy probability distribution functions are recommended for fitting distillation data. Behrenbruch and Dedigama [
62] employed a two-parameter form of the gamma distribution function to fit the TBP distillation data of 24 extra light, light, medium, and heavy crude oils. Xavier et al. [
88] studied TBP distillation data (ASTM D 5307 and ASTM D 2892) of 41 Brazilian crude oils and applied the two-parameter Beta, Gamma, Riazi, and Weibull and the four-parameter Weibull extreme distribution functions. Their research confirms the conclusion of Sanchez et al. [
61] that the Weibull extreme model performs best among these models in terms of correlation coefficient and root mean squared error. Kotzakoulakis and George [
90] used Riazi’s distribution model to fit the TBP distillation curves of crude oils. Hosseinifar and Shahverdi [
17,
55,
89] employed a third-order polynomial to fit petroleum fluid distillation data. They reported in their recent study [
89] that the third-order polynomial outperformed the Beta, Gamma, Riazi, Weibull, and Weibull extreme probability distribution functions. In this review we tested the following functions: the five-parameter Weibull (Equation (1)), the six-parameter Weibull extreme (Equation (2)), the four-parameter gamma (Equation (3)), the four-parameter beta (Equation (4)), Riazi’s distribution model (Equation (5)), and the third-order polynomial model described by Hosseinifar and Shahverdi in [
17,
55,
89] (Equation (6)).
Here Γ(x) is the gamma function, Γ(a,x) is the incomplete gamma function, B(a,b) is the beta function, and B(x,a,b) is the incomplete beta function. In addition, a₁, a₂, a₃, a₄, a₅, and a₆ are parameters that must be fitted statistically to the experimental data.
These mathematical functions were used to simulate the TBP boiling point distributions of the crude oils from
Table S2.
Table 3 summarizes the average absolute error (%AAD) of the predicted evaporated yield at different boiling points for the studied crude oils. These data show that the error in predicting the evaporated yield at a given boiling point decreases in the following order: Weibull (%AAD = 7.11) > third-order polynomial (%AAD = 3.44) > Riazi’s distribution model (%AAD = 0.73) > Gamma (%AAD = 0.65) > Beta (%AAD = 0.58) > Weibull extreme (%AAD = 0.51). Thus, the six-parameter Weibull extreme function outperforms the other boiling point distribution models, and our study is in line with the conclusions of Sanchez et al. [
61], and Xavier et al. [
88]. In addition, the use of probability distribution functions helps fill the gap between the last ASTM D 2892 fraction and the first ASTM D 5236 fraction, as indicated by the data in
Table 4. In this way, the drawback of combining ASTM D 2892 and ASTM D 5236 distillation data when constructing the crude oil TBP curve can be overcome. The model distribution functions approximating TBP curves can also be used to verify the correctness of a crude TBP analysis.
Table 5 presents the TBP fraction yields of Azeri Light crude oil measured in two different laboratories, together with the refinery margins estimated from each set of yields using the refinery LP model (Honeywell Refinery and Petrochemical Modeling System (RPMS)). The two sets of TBP fraction yields gave refinery margins that differed by 2 million USD/month, leaving the refinery management unsure which TBP curve to use in refinery production planning. As is apparent from the data in
Table 3, the Lab. 1 data fit the model functions with a lower error than the Lab. 2 data, which suggests that the Lab. 1 data are the more accurate. An erroneous TBP analysis may lead to underestimation or overestimation of the refinery margin from processing a particular crude oil and may result in an erroneous crude selection. Thus, the correctness of the TBP analysis is vital for proper refinery production planning. Unfortunately, the TBP analysis is very time-consuming, and its replacement by faster distillation methods has been investigated in several studies, with simulated distillation being the best candidate [
56,
69,
74,
77,
78]. However, as evident from the data presented in
Figure 6, the relation between TBP and simulated distillation (ASTM D 7169) differs from one crude oil to another. The heavy-extra heavy crude oils (Boscan and Albanian) could not be distilled by ASTM D 2892. Instead, they were analyzed by physical vacuum distillation in accordance with the ASTM D 1160 standard, which is considered equivalent to TBP.
The data in
Figure 6 show that, for the heavy-extra heavy, heavy, and medium crude oils, high temperature simulated distillation (HTSD, ASTM D 7169) reports lower yields for the fractions boiling below 360 °C (the atmospheric part of the TBP, ASTM D 2892) and higher yields for the fractions boiling above 360 °C (the vacuum part of the TBP, ASTM D 5236).
The light and extra light crude oils, however, show good agreement between HTSD and ASTM D 5236, together with the typically lower HTSD yields for the fractions boiling below 360 °C. Additional investigations are still needed to develop a reliable method for converting simulated distillation data into TBP data for all crude oil types.
3. Petroleum Characterization by Means of SARA Analysis
Petroleum is a complex mixture of hydrocarbon and non-hydrocarbon components with carbon numbers from 1 to over 100 and boiling points from −161.6 °C (methane) to over 760 °C. Distribution models of the TBP of different types of petroleum [
5] indicate that their final boiling temperatures can vary between 1000 and 2000 °C. At a carbon number of 25 (boiling point = 402 °C) the number of possible acyclic alkane isomers is 36.7 × 10⁶, while at a carbon number of 100 (boiling point = 708 °C) it is 5920 × 10³⁶ [
91]. The actual number of components in a crude oil is unknown, but it is assumed to exceed 10⁶ [
91,
92]. For example, in the heaviest fraction of oil boiling above 540 °C (vacuum residue) only 5% of its constituent components are known, the remaining 95% are unknown to mankind [
93]. Analyzing such a complex mixture is a challenge for specialists working in the petroleum, refining and petrochemical industries. Therefore, a separation of the complex petroleum mixture into fractions based on their chemical similarity can facilitate the process of petroleum chemical characterization and petroleum chemistry understanding. The separation of petroleum into saturates, aromatics, resins and asphaltene fractions (SARA) is carried out on the basis of the polarity of these fractions by using different solvents, eluents and adsorbents. The first step in SARA fractionation is the precipitation of the asphaltenes from the oil mixture using n-heptane [
92,
94] or n-pentane [
95]. The de-asphalted petroleum mixture can then be examined for its content of saturates, aromatics, and resins by ASTM D2007, ASTM D4124, high performance liquid chromatography (HPLC), or thin-layer chromatography coupled to a flame ionization detector (TLC-FID; Iatroscan). Significant differences have been reported between the group hydrocarbon (SARA) compositions of oil obtained by the ASTM, HPLC, and TLC-FID methods [
96,
97,
98]. The difference in the methods, the nature of the eluent, and the molecular weight of the solvents significantly affect the relative proportions of each fraction [
46,
95]. SARA results of oil composition have been used by a number of researchers to predict various oil properties, coke formation, stability of asphaltenes in oil, and others [
46,
99,
100,
101]. The group hydrocarbon composition of petroleum is used as input information in a number of equations of state, forming the basis of thermodynamic models predicting sediment formation in the process of petroleum extraction and refining [
102,
103,
104,
105,
106,
107]. The group hydrocarbon composition of feedstock for conversion to lighter high-value products in a range of conversion processes has been shown to carry valuable information about the operation of these processes [
108,
109,
110,
111,
112,
113,
114,
115,
116,
117]. Therefore, the investigation of methods for measuring the SARA composition of petroleum and petroleum derivatives is of considerable practical interest. Moreover, different researchers apply different analytical techniques to measure the SARA composition of oil and oil derivatives, and their results often cannot be compared. In this review article we summarize a large body of literature data on the group hydrocarbon composition of extra light, light, medium, heavy, and extra heavy oils, oil sands, and natural bitumen obtained by different methods, and we search for relationships between SARA composition, the physicochemical properties of oil, and the indices reported in the literature to characterize oil behavior during extraction and refining. The data set comprises 308 crude oil samples, representative of extra light (specific gravity (SG) < 0.8017), light (0.8017 < SG < 0.855), medium (0.8600 < SG < 0.9220), heavy (0.9220 < SG < 1.000), and ultra-heavy (SG > 1.000) crudes, whose SARA compositions were reported in 16 literature sources [
13,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30].
Table S5 summarizes the group composition (SARA) data, the aromatic structure content (the sum of all fractions that contain aromatic carbon, namely the aromatic, resin, and asphaltene components), and the specific gravity of the 308 petroleum samples. Five methods were used for the SARA analysis of crude oil composition, and their percentage distribution in the database of 318 crude oils extracted from the literature sources [
118,
119,
120,
121,
122,
123,
124,
125,
126,
127,
128,
129,
130,
131,
132,
133,
134] is presented in
Table 6.
The data in
Table 6 show that the HPLC method is the most widely applied for SARA analysis of petroleum. It has been applied to all groups of petroleum: extra light, light, medium, heavy, and extra heavy crude oils, oil sands, and natural bitumens. The next most common method is ASTM D2007, which has also been applied to all crude oil types. Liquid chromatography follows; it too has been used to determine the SARA composition of all crude oil groups. Thin-layer chromatography, TLC-FID (Iatroscan), takes fourth place and has been used for SARA analysis of light, medium, and heavy crude oils. The least frequently applied method is the modified ASTM D4124 method, which has likewise been used for SARA analysis of light, medium, and heavy crude oils.
In order to investigate the presence of statistically meaningful relations between the SARA fractions measured by the five methods used for oil group hydrocarbon composition analysis, intercriteria analysis (ICrA) evaluation for all methods was performed. The results of ICrA evaluations are summarized in
Tables S6–S15. The ICrA approach has been applied in several of our studies to search for statistically meaningful relations in petroleum characterization and refining [
9]. A more detailed explanation of the essence of ICrA and its application in petroleum refining can be found in our recent study [
54]. Here, we summarize the meaning of the values of μ and υ that define positive and negative consonance in the ICrA evaluation of the studied petroleum property relations. Values of μ = 0.75–1.00 and υ = 0–0.25 denote a statistically meaningful positive relation, where strong positive consonance corresponds to μ = 0.95–1.00 and υ = 0–0.05, and weak positive consonance to μ = 0.75–0.85 and υ = 0.15–0.25. Correspondingly, values of μ = 0.00–0.25 and υ = 0.75–1.00 denote a statistically meaningful negative relation, where strong negative consonance corresponds to μ = 0.00–0.05 and υ = 0.95–1.00, and weak negative consonance to μ = 0.15–0.25 and υ = 0.75–0.85. From the data presented in
Tables S6–S15, it can be seen that all the methods for SARA analysis of crude oil except thin-layer chromatography (correlation analysis) show a statistically significant relationship of the saturate and aromatic structure contents with the specific gravity of the crude oil. The ASTM D 2007, ASTM D 4124, and high-performance liquid chromatography methods show a moderate statistically significant relationship between these contents and the specific gravity of the crude oil, whereas the liquid chromatography method shows a strong one. A similar relationship between specific gravity and the contents of saturates and aromatic structures was found in our recent study of vacuum residues derived from all groups of oil types [
135]. From this it can be concluded that the specific gravity of crude oil and petroleum derivatives is a major indicator characterizing the chemical nature of oil. By using the nonlinear least squares method, a mathematical relationship was derived relating the specific gravity of crude oil to the content of aromatic structures. This relationship is shown by Equation (7) and in
Figure 7.
where, ARO str. = Aromatic structures content, %wt; SG = specific gravity.
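The consonance thresholds listed in the ICrA discussion above can be turned into a small classifier. The sketch below also computes μ and υ for a pair of properties using the pairwise-ordering definition commonly given for ICrA (the share of sample pairs ranked the same way by both properties, and the share ranked in opposite ways). That definition is an assumption taken from the general ICrA literature and not from this text; the function names and the sample values are hypothetical.

```python
from itertools import combinations


def _sign(value: float) -> int:
    return (value > 0) - (value < 0)


def icra_consonance(x, y):
    """Return (mu, upsilon) for two equally long lists of property values.

    mu      -- fraction of sample pairs ordered the same way by x and y
    upsilon -- fraction of sample pairs ordered in opposite ways
    Pairs tied in either property count toward neither (uncertainty).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    pairs = list(combinations(range(len(x)), 2))
    agree = disagree = 0
    for i, j in pairs:
        sx, sy = _sign(x[i] - x[j]), _sign(y[i] - y[j])
        if sx == 0 or sy == 0:
            continue
        if sx == sy:
            agree += 1
        else:
            disagree += 1
    return agree / len(pairs), disagree / len(pairs)


def classify_consonance(mu: float, upsilon: float) -> str:
    """Label a (mu, upsilon) pair using the threshold ranges quoted in the text."""
    if mu >= 0.95 and upsilon <= 0.05:
        return "strong positive consonance"
    if 0.75 <= mu <= 0.85 and 0.15 <= upsilon <= 0.25:
        return "weak positive consonance"
    if mu >= 0.75 and upsilon <= 0.25:
        return "positive consonance"
    if mu <= 0.05 and upsilon >= 0.95:
        return "strong negative consonance"
    if 0.15 <= mu <= 0.25 and 0.75 <= upsilon <= 0.85:
        return "weak negative consonance"
    if mu <= 0.25 and upsilon >= 0.75:
        return "negative consonance"
    return "no statistically meaningful relation"


if __name__ == "__main__":
    # Hypothetical samples: specific gravity and aromatic structure content, wt%.
    sg = [0.80, 0.84, 0.88, 0.93, 0.98]
    aro = [22.0, 30.0, 41.0, 55.0, 68.0]
    mu, upsilon = icra_consonance(sg, aro)
    print(mu, upsilon, classify_consonance(mu, upsilon))
```

Because the measure depends only on the ordering of the samples, it detects monotonic relations without assuming any particular functional form, which is why a separate regression is still needed to obtain a predictive equation.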
The average absolute deviation between the aromatic structure contents predicted by Equation (7) and the measured values, excluding the data from the Iatroscan analysis, is 7.5%. Our previous studies [
135] have shown that a deviation of the predicted from the measured aromatic structure content greater than the mean absolute deviation suggests an incorrect analysis, which should be repeated. The failure of the SARA data obtained by thin-layer chromatography (TLC, Iatroscan) to obey this dependence indicates that this method disagrees with the other methods. Other studies have already noted this and expressed distrust in Iatroscan results because of the large errors observed when the same sample is analyzed repeatedly [
26]. The inaccuracy of SARA analysis by thin-layer chromatography (TLC, Iatroscan) has been attributed to strongly adsorbed asphaltenes that do not migrate along the chromatographic rod [
136].
Our analysis of SARA composition data of extra light, light, medium, heavy, extra-heavy crude oil types, oil sands and natural bitumen published in the literature, as well as our own investigations with different methods to determine the SARA composition of vacuum residues of different crude oil types confirm the significant deviation of the SARA fraction relationships observed between the four SARA analysis methods (ASTM D 2007, ASTM D 4124, HPLC, and liquid chromatography (LC)) and the Iatroscan method [
26]. From this it can be concluded that the results of thin-layer chromatography are not comparable with those of the other analysis methods and that this analysis method has a high uncertainty, according to the studies of Youtcheff [
136]. It can be seen from the data in
Tables S6–S15 that the relative proportion of resin-asphaltene components increases with increasing aromatic content and crude oil density. In other words, the increase in specific gravity is associated not only with an increase in aromatic structures content in the crude oil, but also with an increase in the relative proportion of resin-asphaltenic components at the expense of a decrease in the relative proportion of aromatic components. It can also be noted that an increase in the content of saturate components is statistically significantly correlated with an increase in the value of the index of colloidal instability, suggesting that crude oil types that have a higher content of saturate components may be colloidally unstable and contribute to an increase in the rate of sedimentation during production and processing. This fact is also confirmed in the studies of Xiong et al. [
129], reporting that sedimentation problems in the oil recovery process occur significantly more frequently in light types than in other oil types. Unfortunately, the crude oil SARA method applied to lighter crude oils is associated with a mass balance inaccuracy as reported by Hemmingsen [
134] and illustrated in
Table 7. This shortcoming could be overcome by using prediction methods such as that reported by Yarranton [
137]. He reported that SARA composition predicted from crude oil high temperature simulated distillation and asphaltene content is more accurate than the typical crude oil SARA analysis. This approach may be well suited to lighter crude oils, where the losses, as evident from the data in
Table 7, are quite high. The SARA analysis of a crude oil could be reconstituted by using correlations developed to predict the content of aromatic structures, or of saturates (equal to 100 minus the content of aromatic structures), in crude oil fractions whose boiling point and density (specific gravity) are known. Such correlations are summarized below.
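The quantities used in this discussion follow directly from a SARA analysis, as the sketch below shows. The aromatic structure content and the saturates-by-difference relation follow the definitions given in the text. The colloidal instability index is not defined in the text, so the form used here, (saturates + asphaltenes) / (aromatics + resins), is the commonly cited definition and is stated as an assumption. Function names and the sample composition are hypothetical.

```python
def aromatic_structures(aromatics_wt: float, resins_wt: float,
                        asphaltenes_wt: float) -> float:
    """Sum of the fractions that contain aromatic carbon, wt%."""
    return aromatics_wt + resins_wt + asphaltenes_wt


def saturates_by_difference(aromatic_structures_wt: float) -> float:
    """Saturates as 100 minus the aromatic structure content, wt%."""
    return 100.0 - aromatic_structures_wt


def colloidal_instability_index(saturates_wt: float, aromatics_wt: float,
                                resins_wt: float, asphaltenes_wt: float) -> float:
    """Assumed definition: (saturates + asphaltenes) / (aromatics + resins)."""
    denominator = aromatics_wt + resins_wt
    if denominator <= 0:
        raise ValueError("aromatics + resins must be positive")
    return (saturates_wt + asphaltenes_wt) / denominator


if __name__ == "__main__":
    # Hypothetical light crude, wt%: saturates, aromatics, resins, asphaltenes.
    sat, aro, res, asp = 60.0, 25.0, 12.0, 3.0
    aro_str = aromatic_structures(aro, res, asp)
    print(aro_str, saturates_by_difference(aro_str))
    print(round(colloidal_instability_index(sat, aro, res, asp), 2))
```

With this form of the index, a higher saturate content raises the numerator, which is consistent with the reported correlation between saturates and colloidal instability.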
The calculation of the saturate content of the vacuum residue fraction (>550 °C) can be performed following the model developed in [
135]. It is given as Equation (8).
The calculation of the saturate content of the vacuum gas oil fraction (360–550 °C) can be carried out following the model developed in [
138]. It is given as Equation (9).
The aromatic ring index (ARI) is calculated by the correlation of Abutaqiya et al. [
139,
140] as shown in Equation (10).
The molecular weight of oil fractions can be calculated by the correlation of Goossens [
49] as shown in Equation (11).
The refractive index of oil fractions can be estimated by the correlation of Stratiev et al. [
141] as shown in Equation (12):
The function of refractive index can be estimated by Equation (13) [
139,
140].
The saturate content in diesel fraction (240–360 °C) can be calculated based on the correlation developed in our earlier study [
142] and shown as Equation (14):
The saturate content in kerosene fraction (180–240 °C) can be calculated based on the correlation developed in our earlier study [
142] and shown as Equation (15).
Unfortunately, in contrast to the other crude oil fractions, the saturate content of the naphtha fraction (IBP–180 °C) does not correlate with density and boiling point. An investigation of the properties of naphtha fractions derived from 244 different crude oils found that the aromatics content, and hence the saturate content (100 − aromatics), does not correlate with any of the studied properties, as shown by the data in
Table 8 and
Table 9.
This is well illustrated by the correlation matrix of naphtha fraction properties for the 244 crude oils.
The data in
Table 8 and
Table 9 indicate that only the research octane number (RON) of naphtha correlates in a statistically meaningful way with specific gravity. Therefore, reconstructing the crude oil SARA composition from the density and boiling point of the crude oil fractions also requires the saturate content (100 − aromatics) of the naphtha fraction, which can be obtained by GC PIANO analysis. Following such a reconstitution method, with GC PIANO analysis of the naphtha fraction, avoids the errors in crude oil SARA composition in cases where the SARA mass balance is poor.
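The reconstitution described above can be sketched as a mass-weighted sum over the crude oil cuts. This is only an illustrative sketch: it assumes that saturate contents blend linearly by mass, the function name `crude_saturates` is hypothetical, and the yields and saturate contents are invented numbers rather than data from this review. In practice the saturate content of each heavier cut would come from Equations (8), (9), (14), and (15), and that of naphtha from GC PIANO analysis.

```python
def crude_saturates(fractions):
    """Mass-weighted saturate content (wt%) of the whole crude.

    fractions: list of (yield_wt_pct, saturates_wt_pct) pairs, one per cut.
    Assumes saturate contents blend linearly by mass (an assumption).
    """
    total_yield = sum(y for y, _ in fractions)
    if total_yield <= 0:
        raise ValueError("fraction yields must sum to a positive value")
    return sum(y * s for y, s in fractions) / total_yield


# Hypothetical cuts: (yield wt%, saturates wt%); values are illustrative only.
cuts = [
    (20.0, 90.0),  # naphtha, saturates from GC PIANO
    (10.0, 80.0),  # kerosene
    (20.0, 70.0),  # diesel
    (25.0, 60.0),  # vacuum gas oil
    (25.0, 20.0),  # vacuum residue
]
saturates = crude_saturates(cuts)
# Aromatic structures follow as the difference to 100, as in the text.
aromatic_structures = 100.0 - saturates
```

Dividing by the sum of the yields rather than by 100 keeps the result meaningful when the cut yields do not close to exactly 100 wt%.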
4. Advanced Techniques in Petroleum Characterization
Detailed molecular characterization of crude oil remains a challenge for analytical chemists because of its inherently complex nature: the number of compounds it contains is larger than the number of genes in the human genome [
143,
144]. Detailed molecular-level compositional information on petroleum can be obtained by sophisticated techniques such as mass spectrometry, comprehensive gas chromatography, and hybrid analytical platforms [
10,
145,
146,
147,
148,
149,
150]. A significant amount of research has been reported on the characterization of crude oil, coal liquefaction products, and bitumens [
151,
152,
153,
154,
155,
156,
157,
158,
159,
160,
161], and a huge number of high-boiling crude oil molecules have been mass resolved and assigned molecular formulas. Marshall et al. coined the term “Petroleomics” for this field of research [
143,
162,
163].
Mass spectrometry has played a critical role in the characterization of petroleum. High field Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometers are now common in petroleum R&D centers [
10]. A 21 Tesla Fourier transform ion cyclotron resonance mass spectrometer (21 T FT-ICR MS) has recently been shown to be capable of producing unmatched breadth and depth of compositional information [
145]. Moreover, the combination of mass spectrometry and chromatography, including gas chromatography (GC), 2D GC (GC×GC), high-performance liquid chromatography (HPLC), and gel permeation chromatography (GPC), has enabled a greater understanding of the types and amounts of certain chemical classes in crude oil. Other general techniques employed for organic compounds, such as NMR (nuclear magnetic resonance), IR (infrared spectroscopy), Raman, terahertz, UV–Vis (ultraviolet–visible spectroscopy), and X-ray diffraction, struggle with the broad range of compounds found in petroleum and generate data too complex to interpret readily [
146,
164].
Figure 8 summarizes the spectroscopic methods for petroleum properties characterization presented by Correa Pabon and Souza Filho in [
164]. Separating petroleum into different fractions prior to crude oil analysis has been demonstrated to be critically important [
144,
146,
162].
Petroleum fractionation helps to improve the dynamic range of the mass spectrometry analysis and the molecular structural assignments, and it simplifies species quantitation by allowing the assumption that chemical species with similar structures in the same fraction have similar ionization efficiency [
162]. Certainly, petroleum assay without fractionation is desirable because it is faster and cheaper, but only electron ionization can ionize all crude oil compound classes [
165]. However, electron ionization causes extensive fragmentation of many of the compounds in crude oil. All other ionization methods apply only to specific classes of compounds.
Different separation approaches have been developed by analytical laboratories with the aim of achieving more detailed fractionation of petroleum [
158,
166,
167,
168,
169,
170,
171,
172,
173,
174,
175,
176,
177,
178,
179,
180,
181]. The main separation techniques are chromatography, fractionation (distillation), and solubility-based separations. The commonly used approach for petroleum fractionation into distinct chemical/solubility types is separation into saturates, aromatics, resins, and asphaltenes by high-performance liquid chromatography [
165,
166]. Bissada et al. applied automated multi-dimensional high-performance liquid chromatography [
166,
167]. Another approach to petroleum separation was proposed by Robson et al. [
168], who separated petroleum on a strong cation-exchange (SCX) solid-phase extraction (SPE) cartridge. Robbins et al. proposed a separation method based on ion exchange and normal phase chromatography (HPLC-2) [
169]. The most recent improvement in high-performance liquid chromatography petroleum separation was developed by Putman et al. [
170] and includes a dual-column aromatic ring class separation (HPLC-3). The separation of saturated and aromatic hydrocarbons, as well as aromatic/aliphatic sulfides, is improved and optimized into 1-, 2-, 3-, 4-, and 5+-ring constituents.
A new approach called distillation precipitation fractionation mass spectrometry (DPF MS) for molecular profiling of crude oil and condensates was developed by Yerabolu et al. [
174] and Alzarieni et al. [
176]. This method is based on separating petroleum into six distinct fractions containing different types of components (by initial distillation of the volatile components followed by precipitation of asphaltenes) and then optimizing the ionization method and mass spectrometry technique for the analysis of each fraction. The DPF MS method gave the weight percentage of each fraction, and an accurate average molecular weight of the crude oil was derived by combining the average molecular weight and the mass of each fraction [
144,
174,
176].
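How fraction data might be combined into a whole-crude average molecular weight can be sketched as follows. The text above does not state which averaging rule the DPF MS authors applied, so the sketch shows two common conventions side by side: a mass-weighted mean and a number-average (total mass divided by total moles). The function names and the six fraction values are hypothetical and purely illustrative.

```python
def mass_weighted_mw(fractions):
    """Mass-weighted mean of the fraction average molecular weights."""
    total_mass = sum(w for w, _ in fractions)
    return sum(w * mw for w, mw in fractions) / total_mass


def number_average_mw(fractions):
    """Number-average molecular weight: total mass over total moles."""
    total_mass = sum(w for w, _ in fractions)
    total_moles = sum(w / mw for w, mw in fractions)
    return total_mass / total_moles


# Six hypothetical fractions: (weight %, average MW in g/mol).
dpf_fractions = [
    (10.0, 100.0),
    (20.0, 200.0),
    (20.0, 250.0),
    (20.0, 400.0),
    (20.0, 500.0),
    (10.0, 1000.0),
]
mw_mass = mass_weighted_mw(dpf_fractions)
mw_number = number_average_mw(dpf_fractions)
```

The two conventions can differ substantially for a wide molecular weight distribution, so the rule used should be reported together with the result.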
Selective separation of hetero-organic compounds from petroleum, such as naphthenic acids, nickel and vanadyl petroporphyrins, and nitrogen and sulfur compounds, is another field that deserves research attention. Generally, two methods for naphthenic acid separation from crude oil are reported in the open literature: liquid-liquid extraction [
182,
183,
184,
185,
186,
187] and solid-phase extraction [
188,
189,
190,
191,
192,
193,
194,
195,
196,
197,
198,
199,
200,
201,
202]. Despite the operational simplicity and relatively low cost of liquid-liquid extraction, the process consumes a large volume of solvents, and its efficiency can be reduced in samples with low analyte concentration because of the tendency to form stable emulsions [
192]. More selective methodologies that minimize solvent use have been reported [
192,
193]. A method for isolating tetraprotic acids from petroleum using molecularly imprinted polymers demonstrated detection of very low concentrations (sub-parts per million) [
193].
Selective isolation of various sulfur types, such as sulfides, thiophenes, and mercaptans, is presented in several studies [
194,
195,
196]. Although quite powerful techniques, such as oxidation, methylation, reduction, and liquid-liquid extraction, are available today for routine use, it is still not possible to obtain clean class separations of all sulfur functional groups [
49]. For example, the separation of disulfides remains an unsolved problem, and no reliable separation principle exists for their analysis in fossil materials [
194].
Isolation of metal petroporphyrins is another active area of research [
199,
200,
201]. Qian et al. developed a cyclograph separation scheme for nickel porphyrins fractionation [
200]. In addition, a combination of solvent extraction, silica gel, and aluminum column separations is proposed in [
201], which allowed isolation of ultrahigh-purity vanadyl petroporphyrins.
Ultrahigh-resolution, high-accuracy mass spectrometry is a widely used tool to characterize crude oils and their derivatives [
144,
146,
202,
203,
204,
205,
206,
207]. FT-MS can be performed in ion cyclotron resonance (ICR) [
144,
146,
204,
206,
207,
208,
209,
210,
211,
212] and Orbitrap mass spectrometers [
202,
213,
214,
215]; non-Fourier transform techniques, such as high-resolution time-of-flight mass spectrometry (TOF-MS), are also used [
216,
217]. FT-ICR MS is the state of the art for determining the elemental compositions of compounds in complex mixtures such as petroleum, owing to its ultra-high resolving power and mass accuracy. The basic principles of FT-ICR MS instrumentation, ionization techniques and data interpretation were reviewed in [
144,
146,
218]. Electrospray ionization (ESI) is the most commonly used ionization technique and is suitable for the gentle ionization of the polar (acidic and basic) components. Negative-mode ESI(−) selectively ionizes acids, phenols, and non-basic nitrogen compounds, while positive-mode ESI(+) selectively ionizes basic nitrogen compounds. ESI(−) FT-MS has been applied in different cases, such as to characterize polar heteroatomic components before and after hydrotreatment, naphthenic acids, and heavy oil distillation cuts [
183,
185,
219,
220,
221]. In addition, ESI (−) 7.2 T FT-ICR MS has been employed to evaluate the crude oil biodegradation level by profiling the changes in relative abundances of compounds in the heteroatom classes, in particular, oxygen containing compounds [
212,
213]. Guricza et al. applied ESI (−) FT-MS for measurement of non-polar polyaromatic hydrocarbons and polyaromatic heterocycles in heavy crude oil asphaltenes [
222]. Other ionization methods that have been found to provide complementary information on the less polar or more volatile constituents of petroleum are atmospheric pressure techniques. For example, atmospheric pressure photoionization (APPI), atmospheric pressure chemical ionization (APCI), atmospheric pressure laser ionization (APLI) and laser desorption ionization (LDI) have been used to better characterize the aromatic hydrocarbons and asphaltenes in crude oil [
154,
205,
223,
224,
225,
226]. APPI outperforms other ionization methods (such as APCI) because it generates mostly odd-electron molecular ions for aromatic molecules without fragmentation.
FT-ICR MS and Orbitrap-MS were compared as tools for the study of nonvolatile crude oil fraction composition by Vanini et al. [
227]. Both techniques enabled the elemental composition to be derived from precise mass measurements, and the most abundant components were mainly of the N, N2, O3, O1, O2, NO2, NS, NOS, and OS classes. FT-ICR MS detected higher molecular weight analytes than Orbitrap-MS [
227].
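Heteroatom classes such as those listed above are normally derived from the elemental formula assigned to each mass peak. The following sketch shows one way such a label could be generated. The helper names, the N-O-S ordering, the omission of a count of one, and the label "HC" for pure hydrocarbons are assumptions made for illustration, not a convention taken from the cited work.

```python
import re

HETEROATOMS = ("N", "O", "S")  # assumed ordering of the class label


def parse_formula(formula):
    """Return element counts for a simple formula such as 'C30H45NO2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def heteroatom_class(formula):
    """Build a class label such as 'NO2' from the N, O, and S counts."""
    counts = parse_formula(formula)
    parts = []
    for element in HETEROATOMS:
        n = counts.get(element, 0)
        if n == 1:
            parts.append(element)
        elif n > 1:
            parts.append(f"{element}{n}")
    # Formulas with no N, O, or S are labeled as hydrocarbons.
    return "".join(parts) or "HC"


labels = [heteroatom_class(f) for f in ("C30H45NO2", "C25H30N2", "C28H40OS")]
```

Grouping the assigned formulas by this label and summing their peak abundances gives the class distribution that such studies typically compare between instruments or samples.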
Ionization of large saturated hydrocarbons is the greatest challenge in petroleum assay by MS techniques. These petroleum constituents are not volatile and do not contain easily ionizable functional groups. ESI, matrix-assisted laser desorption/ionization (MALDI), field desorption, field ionization and APCI have been reported in the literature for ionization of large saturated hydrocarbons [
228,
229,
230], but one should always critically evaluate and, if possible, tailor the ionization conditions to be most suitable for the petroleum sample being analyzed [
144].
Significant advances in the chemical characterization of petroleum have been achieved by applying two-dimensional gas chromatography (GC×GC). Using a polar column followed by a nonpolar column improves the separation of saturated and aromatic hydrocarbons in petroleum distillates [
231,
232]. The combination of two-dimensional gas chromatography with (+) EI TOF MS makes it possible to determine the compounds in an oxidized heavy saturated hydrocarbon fraction [
233], naphthenic acids separated from petroleum [
234], composition of a heavy crude oil and to generate a template that can be used to simulate its distillation [
235]. In addition, the separation of some isomers of large saturated hydrocarbons and their sulfides from a Venezuelan crude oil sample by GC×GC/(+) EI TOF MS has been reported [
236].
Vanini et al. [
237] offered a method that uses GC×GC/(+) EI TOF MS to analyze volatile hydrocarbons and (−) ESI 9.4 T FT-ICR MS to analyze acidic compounds in the same crude oil samples. Various groups of components were identified by GC×GC/(+) EI TOF MS, including linear, branched, and cyclic saturated hydrocarbons, alkylbenzenes, alkylnaphthalenes, alkylindanes, pyrenes, and fluorenes, in petroleum samples of different density [
144,
237].
Integration of three analytical techniques, high-performance liquid chromatography (HPLC) ring-type separation, two-dimensional nuclear magnetic resonance spectroscopy (2D NMR), and ultrahigh-resolution mass spectrometry (MS), for molecular-level characterization of crude oil was employed by Kim et al. [
238] for detailed characterization of its compositions (
Figure 9).
HPLC ring-type separation was used to obtain five petroleum fractions. The features of each fraction obtained by the three techniques agreed well, and the aromaticity increased with the fraction number. Additionally, 2D NMR data and the double bond equivalence (DBE) distribution derived from high-resolution MS showed that the first to third fractions of the HPLC ring-type separation contained hydrocarbon group components with an aromatic ring core and multiple saturated cyclic rings. The structures and distribution of heteroatom group components could be clarified based on combined information on the HPLC elution order and the class and DBE distributions seen using ultrahigh-resolution mass spectrometry [
238].
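The double bond equivalence used above follows directly from an assigned elemental formula. For a neutral formula containing C, H, N, O, and S, the standard expression is DBE = C − H/2 + N/2 + 1 (divalent O and S do not contribute). A minimal sketch is given below; the function name and the formula parser are illustrative, and the sketch assumes neutral formulas limited to these elements.

```python
import re


def double_bond_equivalence(formula):
    """DBE = C - H/2 + N/2 + 1 for a neutral CcHhNnOoSs formula."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    c = counts.get("C", 0)
    h = counts.get("H", 0)
    n = counts.get("N", 0)
    # Oxygen and sulfur are divalent and leave the DBE unchanged.
    return c - h / 2 + n / 2 + 1


dbe_benzene = double_bond_equivalence("C6H6")        # one aromatic ring
dbe_naphthalene = double_bond_equivalence("C10H8")   # two fused rings
dbe_carbazole = double_bond_equivalence("C12H9N")    # N-containing aromatic
```

Each additional ring or double bond raises the DBE by one, which is why a DBE distribution shifting to higher values indicates increasing aromaticity across the fractions.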
Nuclear magnetic resonance (NMR) and infrared (IR) spectroscopy are two analytical methods that generate a large volume of compositional information [
12]. Many studies have appeared recently that suggest approaches based on 1H and 13C NMR and near (NIR) and mid (MIR) infrared spectroscopy combined with chemometrics to predict crude oil properties [
12,
120,
239,
240,
241,
242,
243,
244,
245,
246,
247,
248,
249,
250]. Data obtained by 1H and 13C NMR and near (NIR) and mid (MIR) infrared spectroscopy are used as input for modeling petroleum properties with regression techniques such as principal component regression (PCR) [
242], multiple linear regression (MLR) [
250], partial least squares (PLS) [
242,
247,
248], and machine-learning methods such as artificial neural networks (ANN) and random forest [
241]. Many of the recently published studies continued to apply linear PLS [
239,
245,
251,
252] for modeling petroleum properties on the basis of NIR and MIR spectra because of its simplicity and effectiveness for data with linear behavior. Most petroleum bulk properties can be determined directly. For some properties, such as density and the TBP curve, the relation between spectrum and property is implicit because no absorption band is defined by these properties [
12]. In such cases, the spectra are related to some other component, which in turn is proportional to the modeled property. To the best of our knowledge, no reports have yet appeared on predicting petroleum salt and ash content from NIR and MIR spectra.
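The chemometric workflow described above can be illustrated with principal component regression, one of the regression techniques named in the text. The sketch below is a minimal NumPy implementation run on synthetic, noise-free data. The function names, the two hypothetical latent components, and the density-like property are assumptions made for illustration and do not reproduce any model from the cited studies. The synthetic property depends on the spectra only through the latent components, mirroring the implicit spectrum-property relation noted above.

```python
import numpy as np


def fit_pcr(spectra, prop, n_components):
    """Fit principal component regression; return (x_mean, y_mean, coef)."""
    x_mean = spectra.mean(axis=0)
    y_mean = prop.mean()
    centered = spectra - x_mean
    # Rows of vt are the principal-component loadings of the centered spectra.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T
    scores = centered @ loadings
    # Ordinary least squares of the centered property on the scores.
    gamma, *_ = np.linalg.lstsq(scores, prop - y_mean, rcond=None)
    coef = loadings @ gamma  # regression vector in wavelength space
    return x_mean, y_mean, coef


def predict_pcr(model, spectra):
    """Predict the property for new spectra from a fitted PCR model."""
    x_mean, y_mean, coef = model
    return (spectra - x_mean) @ coef + y_mean


# Synthetic, noise-free data: two hypothetical latent components drive both
# the 50-channel "spectra" and a density-like property.
rng = np.random.default_rng(0)
pure_spectra = rng.uniform(0.0, 1.0, size=(2, 50))
latent_train = rng.uniform(0.0, 1.0, size=(30, 2))
latent_test = rng.uniform(0.0, 1.0, size=(5, 2))
weights = np.array([0.10, 0.05])

x_train = latent_train @ pure_spectra
y_train = 0.80 + latent_train @ weights
x_test = latent_test @ pure_spectra
y_test = 0.80 + latent_test @ weights

model = fit_pcr(x_train, y_train, n_components=2)
y_pred = predict_pcr(model, x_test)
```

With real spectra the number of components would be chosen by cross-validation, and PLS is often preferred because its components are selected for covariance with the property rather than for spectral variance alone.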