
Abstract: One of the most popular ways of representing a robotic model mathematically is the Denavit-Hartenberg (DH) parameter table, and the most common way of finding a forward kinematics solution for a robotic model is to compute its homogeneous transformation matrix, which is obtained from the DH parameter table through a fixed sequence of steps. In this work, we solve this problem in a single step using deep learning, thereby finding the forward kinematics of almost any kind of manipulator. This work suggests that many similar complex problems, which normally require a fixed sequence of algorithmic steps, can be solved by deep learning techniques in a single step. The results obtained are very close to the exact values and demonstrate the ability of deep learning techniques to solve problems of this kind.


Introduction
Robotic technology has an impact on every element of life, both at work and at home. Robots can improve how people live and work by performing tasks more efficiently while increasing safety and quality of service. Robotics is set to be the leading technology for the next generation of devices, thanks to its learning capability and ability to interact with the surroundings, bridging the digital and physical worlds. In large industrial businesses, it has already become essential for flexibility and competitiveness, while in non-manufacturing sectors such as agriculture, healthcare, security, and transportation, robots will have a significant impact. With the growth of these industries, even greater expansion in this domain is expected. Figure 1 shows the different domains of robotics study. The majority of robotics challenges necessitate modelling, which requires both kinematic and dynamic analysis. Most robotics implementations, such as frame transformation [1], hand movement [2,3], stability and control of skeleton or posture [4,5], three-dimensional (3D) transformation [6], bipedal robot trajectory analysis and prediction [7,8], manipulators for industrial applications [9,10], and trajectory tracing [11], necessitate solving the inverse kinematics problem to determine joint trajectories. The kinematic model examines the system's basic geometry, whereas the dynamic model examines the system in motion. The association between forward kinematics and inverse kinematics is depicted in Figure 2. The most common representation of any type of robot is the Denavit-Hartenberg (DH) representation [12]. The DH parameters are a mathematical representation of the robot's link structure and can be used to obtain both forward and inverse kinematics solutions by calculating the homogeneous transformation matrix.
In this research, we first explain the step-by-step procedure for obtaining a homogeneous transformation matrix from a DH parameter table, and then use deep learning techniques to solve the same problem in a single step.

DH frames
DH frames assist us in deriving the equations necessary to control a robotic arm. We use four rules to draw the frames (i.e., the x, y, and z axes), which will allow us to take a shortcut while deriving the robot's mathematics. The DH convention is the name given to all of these rules.
The following are the four rules for drawing DH coordinate frames: (a) the z-axis is the axis of rotation of a revolute joint; (b) the x-axis must be perpendicular to both the current z-axis and the previous z-axis; (c) the y-axis is determined from the x- and z-axes using the right-hand coordinate system; (d) the x-axis must intersect the previous z-axis (this rule does not apply to frame 0). Figure 3 shows the Universal Robot UR5 in the zero position, i.e., a flat orientation parallel to the plane. An end effector, such as a hand, gripper, or suction cup, is attached to the end of the robotic arm. Let us try drawing the DH frames of the UR5 using the rules above. It is a well-known 6 DOF industrial robot that has been used as a standard in numerous experimental studies.
The DH frames of the aforesaid UR5 robot at zero position will be as shown in Figure 4 using the four rules of the DH convention.

Homogeneous transformation matrices
A homogeneous transformation matrix combines a rotation matrix (3 rows, 3 columns) with a displacement vector (3 rows, 1 column) into a single matrix. These matrices are a crucial part of forward kinematics. Let us first discuss rotation matrices and displacement vectors, and then transformation matrices.
Rotation matrices: The angles of a robotic arm can be described using these matrices (i.e. to indicate the direction of the robotic arm). It aids us in detecting how a robot's end effector (e.g., robotic gripping hand, paintbrush, suction cup, etc.) will change its orientation with changes in servo motor angles.
In other words, it is a mathematical representation of a robotic system's orientation. The three coordinate axes (x, y, and z) indicating the position of an object in 3D in one frame are transformed to another frame coordinates via rotation matrices.
Rotation matrices differ depending on the axis about which the frame is rotated. Equations (1)-(3) give the rotation matrices for rotation about the x, y, and z axes, respectively.
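As an illustration, the three rotation matrices can be written in NumPy as follows (a minimal sketch; the function names are our own):

```python
import numpy as np

def rot_x(theta):
    """Rotation of theta radians about the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0,  0],
                     [0, c, -s],
                     [0, s,  c]])

def rot_y(theta):
    """Rotation of theta radians about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c, 0, s],
                     [ 0, 1, 0],
                     [-s, 0, c]])

def rot_z(theta):
    """Rotation of theta radians about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]])
```

For example, rotating the unit x-vector by 90° about z yields the unit y-vector.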

Displacement vectors:
A displacement vector is a list of three values in a single column that represents the displacement (change in position) of one coordinate frame with respect to another.
To indicate the change in the position of coordinate frame n with respect to coordinate frame m, we'll use the notation as in Equation (4).
We can simply multiply the rotation matrices of each joint to determine the orientation of the end effector with respect to the base frame as represented by Equation (5).
However, this is not the case for displacement vectors. For determining the position of the end effector frame with respect to the base frame, we can't just combine displacement vectors as in Equation (6).
A homogeneous transformation matrix [13] is required to solve this problem. The rotation matrix is augmented with the displacement vector to form a single matrix called a homogeneous transformation matrix. Equation (7) shows an example of a homogeneous transformation matrix.
The reason for calculating transformation matrices is that, like rotation matrices, two homogeneous matrices can be multiplied together. For example, to obtain the transformation matrix from frame 0 to frame 2, we multiply the transformation matrix from frame 0 to frame 1 by the transformation matrix from frame 1 to frame 2, as in Equation (8).
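A minimal sketch of building and chaining homogeneous transformation matrices (Equations (7) and (8)) in NumPy; the example frames and the function name are our own:

```python
import numpy as np

def homogeneous(R, d):
    """Combine a 3x3 rotation R and a length-3 displacement d
    into a 4x4 homogeneous transform (Equation (7))."""
    H = np.eye(4)
    H[:3, :3] = R
    H[:3, 3] = d
    return H

# Equation (8): chain transforms by matrix multiplication.
H_0_1 = homogeneous(np.eye(3), [1.0, 0.0, 0.0])   # frame 1: shifted 1 unit along x
H_1_2 = homogeneous(np.eye(3), [0.0, 2.0, 0.0])   # frame 2: shifted 2 units along y
H_0_2 = H_0_1 @ H_1_2                              # frame 0 to frame 2
```

Note that the displacements compose correctly through matrix multiplication, which plain addition of displacement vectors (Equation (6)) cannot guarantee once rotations are involved.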
DH parameter: We can use DH frames to find DH parameters, and we can also use them to find homogeneous transformation matrices. The DH parameter is a set of four variables that aid in the mathematical representation of each joint. According to [14], the DH parameters are as follows.
The DH parameters for the UR5 are given in Table 1, modified to correspond to the frames in Figure 4. The first parameter, θᵢ, is a variable, whereas the rest are constants.
The transformation for each link can be written using the DH parameters as in Equation (9). Multiplying all six link transformation matrices as in Equation (8) gives the complete homogeneous transformation matrix from the base frame to the end-effector frame.
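As an illustration, the per-link transform of Equation (9) and the chaining of Equation (8) can be sketched in NumPy, assuming the standard DH convention Rot_z(θ)·Trans_z(d)·Trans_x(a)·Rot_x(α) (the function names are our own):

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Single-link homogeneous transform from standard DH parameters (Equation (9))."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [ 0,       sa,       ca,      d],
        [ 0,        0,        0,      1],
    ])

def forward_kinematics(dh_rows):
    """Multiply the per-link transforms (Equation (8)) to get the
    base-to-end-effector homogeneous transformation matrix."""
    H = np.eye(4)
    for theta, d, a, alpha in dh_rows:
        H = H @ dh_transform(theta, d, a, alpha)
    return H
```

For example, two links of length 1 at zero joint angles place the end effector 2 units along the base x-axis.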

Deep learning
All of the preceding mathematics can be avoided by using deep learning techniques to solve the problem. Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence (Figure 5). Artificial intelligence refers to methods that allow machines to emulate human-like intelligent behaviour, while deep learning is influenced by the structure of the human brain. Deep learning algorithms analyse inputs with a predetermined logical structure in order to reach conclusions similar to those of humans. They achieve this by employing a multi-layered structure of neurons known as a neural network, as shown in Figure 6. The different layers of a neural network act as filters that work from the most obvious features to the most subtle, improving the chance of predicting the right result. Such neural network models are capable of solving problems that classical machine learning models cannot.
The lack of need for so-called feature extraction gives deep learning an edge. Feature extraction is usually quite difficult and necessitates a thorough understanding of every part of the problem, and for best results the pre-processing must be customised, tested, and optimised. Deep learning, on the other hand, does not require explicit feature extraction.
The layers can directly and independently learn an implicit representation of the raw input, which over successive layers forms a more compressed and abstract form of the raw data. This compact representation is then used to generate the result, which could, for example, be the classification of the data into different classes. Consider the case shown in Figure 7. Thus, we can say that feature extraction is included in the artificial neural network's operation. In practice, avoiding explicit feature extraction applies to every task performed with neural networks: simply feed the raw inputs to the network, and the model handles the rest.

Preparation of dataset
As discussed in the last section, we need to provide data for training our neural network. There is no publicly available dataset for training neural networks to find homogeneous transformation matrices, so we applied the mathematical approach discussed in Section 2 to generate a large dataset with which to train our model properly.
First, we randomly assigned values to the different DH parameters (i.e., a, d, r, and θ). Based on the number of links and degrees of freedom, multiple sets of DH parameters are required; for example, a 3 DOF, 3 link manipulator requires 3 sets of DH parameters, as shown in Table 2. We generated 10,000 such samples and then calculated the homogeneous transformation matrix for each using Equation (9). As discussed earlier, the homogeneous transformation matrix is a 4 × 4 matrix, but we store it as a single row, so the output consists of these 16 values and the input consists of 4 × n values, where n is the number of links. For the 3 DOF, 3 link manipulator above, the input therefore has 12 values.
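The dataset-generation step can be sketched as follows (a minimal sketch; the sampling range of [-π, π] for all four parameters and the function names are our own assumptions, since the exact ranges used are not stated):

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Single-link homogeneous transform from standard DH parameters (Equation (9))."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [ 0,       sa,       ca,      d],
                     [ 0,        0,        0,      1]])

def make_dataset(n_links, n_samples, seed=0):
    """Random DH parameter sets (network inputs, 4*n_links values each)
    and the flattened 4x4 transforms they produce (16 target values each)."""
    rng = np.random.default_rng(seed)
    params = rng.uniform(-np.pi, np.pi, size=(n_samples, n_links, 4))
    targets = np.empty((n_samples, 16))
    for i, rows in enumerate(params):
        H = np.eye(4)
        for theta, d, a, alpha in rows:
            H = H @ dh_transform(theta, d, a, alpha)
        targets[i] = H.reshape(16)
    return params.reshape(n_samples, 4 * n_links), targets

X, y = make_dataset(n_links=3, n_samples=100)   # X: (100, 12), y: (100, 16)
```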

Architecture
There is no specific rule about which kind of neural network should be used for which purpose. Neural networks are broadly of three types: feed-forward, recurrent, and convolutional, and each is further categorised into different kinds. For our work, we found by experiment that a feed-forward multi-layered neural network is best suited [15,16]. We therefore tried different feed-forward architectures with different numbers of layers and neurons, different activation functions, and different drop-out and regulariser methods to find the best-suited network for our task.
The neural network structure varies for different robotic models, as each has a different number of input nodes. The neural network for a 6 DOF, 6 link manipulator is shown in Figure 6. By trying out different activation functions, we found that the sigmoid function (11) and the rectified linear unit (ReLU) function (12) are the most suitable. We employ the sigmoid function in the hidden layers and the ReLU function in the output neurons.
The neural network needs to start with some weights and then iteratively update them towards better values. In our case, we used a uniform distribution to initialise all of the weights; during training, these weights are modified in order to reduce the losses. An optimizer is a function or algorithm that modifies the attributes of the neural network, such as the weights and learning rate. There are different types of optimizers: Gradient Descent, Stochastic Gradient Descent, Adagrad, Adadelta, Adam, etc. We found experimentally that the Adam optimizer is the most suitable for our work, as it gives the minimum losses.
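The network described above (uniformly initialised weights, sigmoid hidden layers, ReLU outputs) can be sketched as a forward pass in NumPy. This is a minimal sketch: the hidden-layer width of 64 and the uniform-initialisation limit of 0.05 are our own illustrative choices, and in practice the weights would be trained with the Adam optimizer in a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def init_layer(n_in, n_out, limit=0.05):
    """Uniformly distributed initial weights, zero biases."""
    return rng.uniform(-limit, limit, size=(n_in, n_out)), np.zeros(n_out)

# 12 inputs (4 DH parameters x 3 links) -> two sigmoid hidden layers -> 16 ReLU outputs
W1, b1 = init_layer(12, 64)
W2, b2 = init_layer(64, 64)
W3, b3 = init_layer(64, 16)

def forward(x):
    """Map the 12 DH parameters of a 3 link manipulator to the 16
    predicted entries of the homogeneous transformation matrix."""
    h1 = sigmoid(x @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    return relu(h2 @ W3 + b3)
```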
In essence, we reduce this algorithmic approach of finding a homogeneous transformation matrix to a single step using a neural network; the whole concept is explained with the help of the flow diagram in Figure 8. To test the efficiency of the neural network in finding the homogeneous transformation matrix and forward kinematics, we consider three types of manipulator: first with 3 links and 3 DOF, second with 5 links and 6 DOF, and last with 11 links and 12 DOF. We prepared separate datasets for each manipulator. The neural network employed is almost the same in all cases; only the number of input nodes differs. We use 10% of the generated datasets for validation, 15% for testing, and the remaining 75% for training the network.
As this is a regression problem, we use the root mean squared error (RMSE) and the cosine similarity to assess our model's performance. Mean squared error (13) is the most common way of analysing regression models; it is the mean of the squared errors between the actual output and the predicted output, and RMSE is its square root.
Cosine similarity (14) is a mathematical measure of the similarity between the actual output and the predicted output, ranging from -1 to 1. When used as a loss function, the negative of the cosine similarity is taken: 0 denotes orthogonality, values nearer to -1 indicate higher similarity, and values nearer to 1 indicate higher dissimilarity. As a result, it can be minimised in order to reduce the distance between targets and predictions. Regardless of the distance between targets and predictions, the cosine loss is 0 if either y_true or y_pred is a zero vector.
Loss = -∑ (l2_norm(y_true) · l2_norm(y_pred))

Figures 9 and 10 show the loss curves for the 6 DOF manipulator while training the model to predict the homogeneous transformation matrix. The first curve shows the RMSE loss during training and validation, while the second shows the cosine loss during training and validation. As the loss curves show, the loss is very small; thus, our model is quite accurate at learning the hidden patterns of the problem.
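The two evaluation measures can be sketched as follows (a minimal sketch; the function names are our own, and cosine_loss follows the negated convention described above, returning -1 for identical directions):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted outputs."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def cosine_loss(y_true, y_pred, eps=1e-12):
    """Negative cosine similarity: -1 for identical direction, 0 for
    orthogonal vectors, and 0 whenever either vector is zero."""
    a = np.asarray(y_true, dtype=float)
    b = np.asarray(y_pred, dtype=float)
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na < eps or nb < eps:
        return 0.0
    return -np.dot(a / na, b / nb)
```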

Conclusion
As the results show, the neural network is capable of computing the rotation matrix and the displacement vector, and thus finding the forward kinematics, with an RMSE of around 0.006 in most cases. Hence, the predicted results are very close to the actual results, and we can say that neural networks are capable of solving algorithmic problems, such as, in our case, finding forward kinematics from the DH parameter table.