Neural Network in R
Introduction
A neural network is an information-processing model inspired by the human brain. Just like our nervous system is made up of interconnected neurons, a neural network consists of interconnected processing units (called nodes or artificial neurons).
The key strength of neural networks lies in their parallel processing ability. Unlike traditional linear models, they can capture complex, non-linear relationships within data. This makes them highly effective for tasks such as pattern recognition, prediction, and classification in large or unstructured datasets.
However, despite their power, neural networks are often called a “black box.” Users can clearly see the inputs and outputs, but the decision-making process inside is not always easy to interpret. In this article, we’ll take a closer look at how neural networks work, and we’ll also walk through a hands-on example of fitting and evaluating a neural network in R.
Table of Contents
The Basics of Neural Networks
Fitting a Neural Network in R
Cross Validation of a Neural Network
The Basics of Neural Networks
At the core of every neural network is the activation function: at each node, it transforms the weighted sum of incoming signals into an output as information flows from one layer of the network to the next.
The input layer receives raw data.
Hidden layers transform that data using weighted connections.
The output layer generates the final result.
What makes neural networks powerful is their ability to learn from data. By adjusting weights between connections, they improve predictions over time. This is similar to how humans learn from experience.
The Perceptron: The Starting Point
The simplest neural network is a perceptron, or single-layer neural network. It takes multiple inputs, performs a weighted summation, and applies an activation function to produce an output.
While perceptrons can solve simple problems, they struggle with non-linear patterns. For example, they cannot solve the classic XOR problem. This limitation is overcome by multi-layer networks, where input passes through one or more hidden layers before reaching the output.
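To make the perceptron concrete, here is a minimal sketch in R (my illustration, not code from the tutorial): a weighted sum plus a bias term, passed through a step activation. Hand-picked weights can implement AND, but no single unit can reproduce XOR.

# A minimal perceptron sketch (illustrative; not part of the tutorial's code)
perceptron = function(x, w, b) {
  as.numeric(sum(w * x) + b > 0)  # step activation: fire if the weighted sum clears the threshold
}

# Hand-picked weights implement logical AND
w = c(1, 1); b = -1.5
perceptron(c(1, 1), w, b)  # returns 1
perceptron(c(1, 0), w, b)  # returns 0

# No choice of w and b makes this single unit compute XOR, because the points
# (0,1) and (1,0) cannot be linearly separated from (0,0) and (1,1).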
Learning Rules and Backpropagation
Neural networks rely on learning rules to optimize weights:
Least Mean Square
Gradient Descent
Newton’s Rule
Conjugate Gradient
Most commonly, these are combined with the backpropagation algorithm. Backpropagation calculates the error at the output layer and propagates it backward through the network; each weight is then adjusted in proportion to its contribution to that error, allowing the network to learn and improve its accuracy.
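As a toy illustration of a gradient-descent weight update (my sketch, not from the original article), consider a one-weight "network" y_hat = w * x trained on a single example with squared error. At each step the weight moves a small distance against the gradient of the error:

# Toy gradient-descent update for a single weight (illustrative sketch)
x = 2; y = 10   # one training example
w = 0.5         # initial weight
lr = 0.05       # learning rate
for (step in 1:50) {
  y_hat = w * x
  grad = -2 * (y - y_hat) * x  # derivative of (y - w*x)^2 with respect to w
  w = w - lr * grad            # move against the gradient
}
w * x  # prediction approaches 10 as w converges toward 5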
Fitting a Neural Network in R
To bring theory into practice, let’s fit a neural network in R. We’ll use the cereals dataset from Carnegie Mellon University, where the goal is to predict the rating of cereals based on features like calories, protein, fat, sodium, and fiber.
Step 1: Read and Split the Data
# Read the data
data = read.csv("cereals.csv", header = TRUE)

# Random sampling (60% training, 40% testing)
samplesize = 0.60 * nrow(data)
set.seed(80)
index = sample(seq_len(nrow(data)), size = samplesize)
datatrain = data[index, ]
datatest = data[-index, ]
We split the dataset into training and test sets. The training set helps the model learn, while the test set checks how well it generalizes.
Step 2: Scale the Data
Scaling is essential because variables measured on different scales can distort the fit. We'll use min-max normalization, which rescales every variable to the same 0–1 range while preserving the shape of its distribution.
# Scale data using min-max normalization
max = apply(data, 2, max)
min = apply(data, 2, min)
scaled = as.data.frame(scale(data, center = min, scale = max - min))
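Equivalently, each value x is mapped to (x - min) / (max - min), which is exactly what this call to scale() computes column by column. Keep max and min around: we will need them in Step 4 to map predictions back to the original rating scale.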
Step 3: Build and Visualize the Neural Network
We’ll use the neuralnet package in R. Our model has one hidden layer with 3 neurons.
# Install and load library
install.packages("neuralnet")
library(neuralnet)

# Split scaled data
trainNN = scaled[index, ]
testNN = scaled[-index, ]

# Fit the neural network
set.seed(2)
NN = neuralnet(rating ~ calories + protein + fat + sodium + fiber,
               data = trainNN, hidden = 3, linear.output = TRUE)

# Plot the network
plot(NN)
The diagram shows inputs, hidden neurons, and outputs, with weights optimized through backpropagation.
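If you want to experiment with a deeper architecture, the hidden argument also accepts a vector of layer sizes. A variant not covered in this tutorial:

# Variant (not in the original tutorial): two hidden layers with 5 and 3 neurons
NN2 = neuralnet(rating ~ calories + protein + fat + sodium + fiber,
                data = trainNN, hidden = c(5, 3), linear.output = TRUE)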
Step 4: Make Predictions
Now, we'll predict cereal ratings on the test set. Since the model was trained on scaled data, its predictions are also on the 0–1 scale, so we must transform them back to the original rating scale.
# Predict on test data (columns 1 to 5 hold the five predictors)
predict_testNN = compute(NN, testNN[, c(1:5)])

# Rescale predictions back to the original rating range
predict_testNN = (predict_testNN$net.result *
  (max(data$rating) - min(data$rating))) + min(data$rating)

# Compare predicted vs actual
plot(datatest$rating, predict_testNN, col = 'blue', pch = 16,
     ylab = "Predicted Rating", xlab = "Actual Rating")
abline(0, 1)

# Calculate RMSE (equivalent to sqrt(mean((datatest$rating - predict_testNN)^2)))
RMSE.NN = (sum((datatest$rating - predict_testNN)^2) / nrow(datatest)) ^ 0.5
On this split, the model achieves an RMSE of 6.05 rating points, which indicates decent predictive performance.
Cross Validation of a Neural Network
Evaluating models with only a training-test split can be misleading because results may vary depending on the split. To ensure robustness, we use cross-validation.
K-Fold Cross Validation
Here, the data is split into k subsets (folds). Each fold is used once as the test set, while the remaining k-1 folds form the training set. Repeating this k times gives a more reliable measure of performance than any single split.
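For reference, here is a minimal sketch of how k-fold indices could be built in R (my illustration, assuming k = 5; this is not the code we run below):

# Minimal k-fold index construction (illustrative; assumes k = 5 folds)
k_folds = 5
folds = sample(rep(1:k_folds, length.out = nrow(data)))  # random fold label per row
for (fold in 1:k_folds) {
  test_index = which(folds == fold)
  train_index = setdiff(seq_len(nrow(data)), test_index)
  # fit on scaled[train_index, ] and evaluate on scaled[test_index, ]
}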
In our case, we use a variant of this idea, repeated random sub-sampling: the training set size varies from 10 to 65 rows, the sampling is repeated 100 times for each size, and we compute the RMSE of each run.
# Cross-validation setup
install.packages("boot")
install.packages("plyr")
library(boot)
library(plyr)

set.seed(50)
k = 100          # repetitions per training set size
RMSE.NN = NULL
List = list()

# Outer loop varies the training set size j; inner loop repeats the sampling
for (j in 10:65) {
  for (i in 1:k) {
    index = sample(1:nrow(data), j)
    trainNN = scaled[index, ]
    testNN = scaled[-index, ]
    datatest = data[-index, ]

    NN = neuralnet(rating ~ calories + protein + fat + sodium + fiber,
                   data = trainNN, hidden = 3, linear.output = TRUE)

    predict_testNN = compute(NN, testNN[, c(1:5)])
    predict_testNN = (predict_testNN$net.result *
      (max(data$rating) - min(data$rating))) + min(data$rating)

    RMSE.NN[i] = (sum((datatest$rating - predict_testNN)^2) / nrow(datatest)) ^ 0.5
  }
  List[[j]] = RMSE.NN
}

# Empty slots 1-9 of List are dropped by cbind, leaving one column per size (10-65)
Matrix.RMSE = do.call(cbind, List)
We then visualize RMSE variation across training set sizes.
# Boxplot for training size = 65 (column 56, since sizes run from 10 to 65)
boxplot(Matrix.RMSE[, 56], ylab = "RMSE",
        main = "RMSE BoxPlot (Training Size = 65)")
# Median RMSE vs training set size
install.packages("matrixStats")
library(matrixStats)

med = colMedians(Matrix.RMSE)
X = seq(10, 65)
plot(med ~ X, type = "l", xlab = "Training Set Size",
     ylab = "Median RMSE", main = "Variation of RMSE with Training Set Size")
The results show a clear trend: as training size increases, RMSE decreases. Larger training sets improve accuracy and stability.
End Notes
In this article, we:
Explored the fundamentals of neural networks and their similarity to biological nervous systems.
Implemented a neural network in R using the neuralnet package.
Evaluated performance using RMSE and strengthened reliability with cross-validation.
The key takeaway is that while neural networks can be complex, they are extremely powerful for solving non-linear problems. Their accuracy depends not just on architecture, but also on the size and quality of training data.
For practitioners, always remember:
- Scale your data before fitting.
- Evaluate models beyond a single train-test split.
- Use cross-validation to confirm robustness.

By following these principles, neural networks can become less of a “black box” and more of a practical, trustworthy tool in your analytics toolkit.