Danomics provides beta access to some of our internal R&D tools before they are officially supported. This gives customers access to cutting-edge technologies while we collect their feedback. While in the R&D stage these tools are regularly enhanced and modified, and backwards compatibility is not always guaranteed.
ML-Based Log Prediction
Danomics’ machine-learning-based log prediction tools were trained on a data set of over 10,000 well logs from around the globe. The data covers both conventional and unconventional reservoirs and a wide array of lithologies. Prior to model training the data was analyzed and cleaned using Danomics’ interpretation-ready data workflows. This includes:
- Unit conversions, lithological reference standardization, and casting all curves in a common curve family type
- Correction and/or removal of washout intervals
- A statistical analysis of the curve data to remove spurious, nonsensical, and erroneous values
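As a rough illustration of the cleaning steps listed above, the sketch below converts a sonic curve to a common unit and nulls values outside a plausible range. It is not the Danomics workflow: the function names and the `VALID_RANGE` limits are hypothetical and chosen only for the example.

```python
import math

# 1 ft = 0.3048 m, so a slowness in us/m times 0.3048 gives us/ft.
US_PER_M_TO_US_PER_FT = 0.3048

# Hypothetical plausibility limits for illustration only;
# these are NOT the thresholds Danomics uses.
VALID_RANGE = {"DTC": (40.0, 200.0), "RHOB": (1.0, 3.2), "PEF": (0.0, 10.0)}


def to_us_per_ft(values, unit):
    """Return sonic slowness values expressed in usec/ft."""
    unit = unit.lower()
    if unit in ("us/ft", "usec/ft"):
        return list(values)
    if unit in ("us/m", "usec/m"):
        return [v * US_PER_M_TO_US_PER_FT for v in values]
    raise ValueError(f"Unsupported sonic unit: {unit}")


def null_out_of_range(values, curve):
    """Replace missing or physically implausible samples with NaN."""
    low, high = VALID_RANGE[curve]
    cleaned = []
    for v in values:
        keep = not math.isnan(v) and low <= v <= high
        cleaned.append(v if keep else math.nan)
    return cleaned


dtc = to_us_per_ft([328.084, 180.0], "us/m")
dtc_clean = null_out_of_range([55.0, -999.25, 500.0], "DTC")
```

A fixed range check is the simplest possible filter. A statistical screen, for example one based on per-formation percentiles, would catch values that are plausible in general but wrong for a given interval.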
Benchmarks on curve prediction were then established using published relationships when available (e.g., the Gardner and Faust relations for DTC). Further benchmarks were established using random forest and gradient boosting methodologies. In all cases the mean absolute error (MAE) was used to judge model efficacy, because it is expressed in the units of the curve itself and is therefore easier to interpret than metrics such as mean squared error. Multilinear regression models were also tested, but were discarded due to poor performance on multi-well predictions (despite their excellent performance for curve reconstruction within a single well). Benchmark accuracy values for compressional sonic (DTC) prediction are:
- Gardner and Faust relationships: 11-15 usec/ft (both are very lithology dependent)
- Random Forest: 5.7-6.5 usec/ft (depending on model architecture)
- Gradient Boosting: 5.8-6.3 usec/ft (depending on model architecture)
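To make the benchmark metric concrete, here is a minimal sketch of a Gardner-style DTC estimate and the MAE calculation. It assumes the commonly quoted form of Gardner's relation, ρ = 0.23·V^0.25 with ρ in g/cm3 and V in ft/s. The default coefficients are the general published ones and are not necessarily those used in the Danomics benchmark, and the function names are illustrative.

```python
import math


def gardner_dtc(rhob_gcc, a=0.23, b=0.25):
    """Estimate DTC (usec/ft) from bulk density by inverting Gardner's relation.

    rho = a * V**b  =>  V = (rho / a)**(1 / b), with V in ft/s.
    Slowness in usec/ft is then 1e6 / V.
    """
    velocity_ft_s = (rhob_gcc / a) ** (1.0 / b)
    return 1.0e6 / velocity_ft_s


def mean_absolute_error(actual, predicted):
    """MAE over depth steps where both curves have a value."""
    pairs = [
        (a, p)
        for a, p in zip(actual, predicted)
        if not (math.isnan(a) or math.isnan(p))
    ]
    if not pairs:
        raise ValueError("No overlapping samples to compare")
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


rhob = [2.30, 2.45, 2.60]            # g/cm3, made-up samples
measured_dtc = [98.0, 80.0, 62.0]    # usec/ft, made-up samples
predicted_dtc = [gardner_dtc(r) for r in rhob]
mae = mean_absolute_error(measured_dtc, predicted_dtc)
```

Because MAE is in usec/ft, a value can be compared directly against the ranges quoted in the list above.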
For photoelectric factor (PEF) logs, the benchmark accuracy values are:
- Random Forest: 0.72-0.90 b/e (depending on model architecture)
- Gradient Boosting: 0.68-0.83 b/e (depending on model architecture)
Danomics tested a wide range of neural network architectures, including shallow-learning and deep-learning models of various depths and widths (i.e., nodes per layer). Model efficacy was then tested using a blind set of test wells that were selected at random using a process that ensures spatial isolation. When judging models the following criteria were used:
- Accuracy as measured by mean absolute error
- Repeatability when trained/tested on various data sets
- Training and prediction speed (important for continuing model improvements and user experience)
- Model export size (only used when a vast disparity exists for similar performance)
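The source does not describe how the spatially isolated blind wells were chosen. One simple way to approximate the idea is to hold out whole grid cells rather than individual wells, as sketched below. The function name, the grid approach, and all parameters are assumptions made for illustration.

```python
import random


def spatial_blind_split(wells, cell_size, test_fraction=0.2, seed=0):
    """Split wells into train/test sets by holding out whole grid cells.

    wells: dict mapping well id -> (x, y) surface location.
    Wells that share a grid cell always land in the same set, so a test
    well never has a training well inside its own cell.
    """
    cells = {}
    for well_id, (x, y) in wells.items():
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append(well_id)

    keys = sorted(cells)                  # stable order before shuffling
    random.Random(seed).shuffle(keys)     # reproducible random selection
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train, test = [], []
    for key, ids in cells.items():
        (test if key in test_keys else train).extend(ids)
    return train, test


wells = {
    "A": (0.5, 0.5), "B": (0.6, 0.4),   # same cell
    "C": (5.5, 5.5),
    "D": (9.1, 0.2), "E": (9.3, 0.1),   # same cell
}
train_ids, test_ids = spatial_blind_split(wells, cell_size=1.0)
```

A grid split does not separate wells that sit on either side of a cell boundary. A stricter version would also drop training wells within a buffer distance of any test well.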
Our internal testing showed that moderately deep and reasonably narrow neural networks gave both the best and the most repeatable results. Extremely deep networks did not provide additional benefits (which likely speaks to the correlative nature of the data). Care was taken to avoid extremely wide networks, which tend to “memorize” data relationships without showing robust general performance in blind testing. Results from the selected architectures are:
- DTC: 5.4-5.7 usec/ft (MAE)
- PEF: 0.64-0.68 b/e (MAE)
These modestly outperformed the benchmark machine learning methods (Random Forest and Gradient Boosting), but trained several orders of magnitude faster and resulted in significantly smaller model sizes (hundreds of KB vs. several GB, which means faster access for users).
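A quick parameter count shows why a narrow, moderately deep network exports to a small file. The layer sizes below are hypothetical, since the selected architectures are not published here, and the estimate assumes fully connected layers stored as 32-bit floats.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the node count of each layer, input first,
    e.g. [4, 128, 128, 1]. Each layer pair contributes
    n_in * n_out weights plus n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def model_size_kb(layer_sizes, bytes_per_param=4):
    """Approximate storage for the parameters alone, in KB."""
    return dense_parameter_count(layer_sizes) * bytes_per_param / 1024.0


# Hypothetical architecture: 4 input curves, five hidden layers of
# 128 nodes, one output curve. Not the actual Danomics model.
layers = [4, 128, 128, 128, 128, 128, 1]
n_params = dense_parameter_count(layers)
size_kb = model_size_kb(layers)
```

Under these assumptions the network has 66,817 parameters, or roughly 260 KB, which is the same order of magnitude as the model sizes quoted above.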
Using the Model
The models for DTC and PEF log prediction are available in Danomics Flows. The overall flow comprises the following tool stack: LogInput >> CpiLogCalc >> LogPrediction >> DeleteLogCurve (x4) >> LogOutput as shown below:
The “Python” block will show up as “LogPrediction” for users. The logic for each block is given below:
- LogInput: Select the database to use for predictions
- CpiLogCalc: Adds the GR_Final, RhoB_Final, Nphi_Final, and Resd_Final curves to the database (see screenshot below). This is done so that we are working on aliased, composited, unit-corrected curves that have (optionally) been normalized and repaired for washout.
- LogPrediction: Choose what curves to predict and what curves to use for prediction
- DeleteLogCurve: The CpiLogCalc block adds the “_Final” curves to the log database. Because further work will be performed on the resulting database, these curves are removed again here (one DeleteLogCurve block per curve).
- LogOutput: Provide a name for the resulting database.
The CpiLogCalc block is used so that the “_Final” curves can be added to the Flow for use in the prediction. Note that you don’t need to provide a CPI file. If you do, the block will use that CPI to apply things like normalization and washout repair. If you don’t select a CPI, the block will just use the config to composite, alias, and standardize curves.
The DTC and PEF predictions first try a model that uses GR, RhoB, Nphi, and ResD. If ResD is not available, a secondary model that does not use resistivity as an input is tried instead. Future work may include support for sparser curve datasets. Depth steps without GR, RhoB, and Nphi will not receive predictions. Wells missing one of these required curves will also not receive a prediction.
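The fallback behavior described above can be sketched as a per-depth-step decision. This is an illustration of the stated rules only, not the LogPrediction implementation: the curve keys, the function names, and the stand-in models are hypothetical.

```python
import math

# Curves every depth step must have to receive a prediction.
REQUIRED = ("GR", "RHOB", "NPHI")


def _present(value):
    """True when a sample exists and is not NaN."""
    return value is not None and not math.isnan(value)


def choose_model(sample):
    """Return 'primary', 'secondary', or None for one depth step."""
    if not all(_present(sample.get(name)) for name in REQUIRED):
        return None                      # no prediction at this depth
    if _present(sample.get("RESD")):
        return "primary"                 # GR, RhoB, Nphi, and ResD
    return "secondary"                   # GR, RhoB, and Nphi only


def predict_curve(samples, models):
    """Apply the chosen model per depth step; NaN where none applies."""
    predictions = []
    for sample in samples:
        name = choose_model(sample)
        predictions.append(models[name](sample) if name else math.nan)
    return predictions


# Stand-in models that return constants so the routing is easy to see.
models = {"primary": lambda s: 1.0, "secondary": lambda s: 2.0}
samples = [
    {"GR": 80.0, "RHOB": 2.45, "NPHI": 0.21, "RESD": 12.0},
    {"GR": 80.0, "RHOB": 2.45, "NPHI": 0.21, "RESD": math.nan},
    {"GR": 80.0, "RHOB": math.nan, "NPHI": 0.21, "RESD": 12.0},
]
routed = predict_curve(samples, models)
```

A well that lacks one of the required curves entirely fails the first check at every depth step, so it receives no prediction at all.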
Tips and Tricks
- Use the CpiLogCalc to access the “_Final” curves.
- Work a preliminary CPI through the Badhole ID & Repair module to ensure that you are not making predictions with data from washout intervals.
- Add the DTC and PEF results to your alias table overrides so they will be available for interpretations.
- Evaluate the results by comparing the predicted curves to actual curves. Is it working within your zones of interest? Where does it break down?
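For the last tip, one way to see where the prediction holds up and where it breaks down is to compute the MAE separately for each zone. The sketch below assumes you have exported depth, measured, and predicted values as plain lists. The zone names, tops, and sample values are made up.

```python
import math


def mae_by_zone(depths, actual, predicted, zones):
    """Mean absolute error per zone.

    zones: dict mapping zone name -> (top, base); a depth step belongs
    to a zone when top <= depth < base. Depth steps where either curve
    is NaN are skipped. Zones with no valid samples return NaN.
    """
    result = {}
    for name, (top, base) in zones.items():
        errors = [
            abs(a - p)
            for d, a, p in zip(depths, actual, predicted)
            if top <= d < base and not (math.isnan(a) or math.isnan(p))
        ]
        result[name] = sum(errors) / len(errors) if errors else math.nan
    return result


depths = [1000.0, 1001.0, 2000.0, 2001.0]
measured = [80.0, 82.0, 60.0, 62.0]
predicted = [81.0, 85.0, 60.0, math.nan]
zones = {"Upper": (990.0, 1500.0), "Lower": (1500.0, 2500.0)}
zone_mae = mae_by_zone(depths, measured, predicted, zones)
```

A zone whose MAE is well above the blind-test ranges reported earlier is a good candidate for closer inspection, for example for washout or unusual lithology.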
Remember that if you need help, you can always reach us at firstname.lastname@example.org.