Neurocontrol in Sequence Recognition

William J. Byrne, Shihab A. Shamma, in Neural Systems for Control, 1997

1 Introduction

Central to many formulations of sequence recognition are problems in sequential decision-making. Typically, a sequence of events is observed through a transformation that introduces uncertainty into the observations, and based on these observations, the recognition process produces a hypothesis of the underlying events. The events in the underlying process are constrained to follow a certain loose order, for example by a grammar, so that decisions made early in the recognition process restrict or narrow the choices that can be made later. This problem is well known and leads to the use of dynamic programming (DP) algorithms [Bel57] so that unalterable decisions can be avoided until all available information has been processed.

DP strategies are central to hidden Markov model (HMM) recognizers [LMS84,Lev85,Rab89,RBH86] and have also been widely used in systems based on neural networks (e.g., [SIY+89,Bur88,BW89,SL92,BM90,FLW90]) to transform static pattern classifiers into sequence recognizers. The similarities between HMMs and neural network recognizers are a topic of current interest [NS90,WHH+89]. The neural network recognizers considered here will be those that fit within an HMM formulation. This covers many networks that incorporate sequential decisions about the observations, although some architectures of interest are not covered by this formulation (e.g., [TH87,UHT91,Elm90]).

The use of dynamic programming in neural network-based recognition systems is somewhat contradictory to the motivating principles of neurocomputing. DP algorithms first require precise propagation of probabilities, which can be implemented in a neural fashion [Bri90]. However, the component events that make up the recognition hypothesis are then found by backtracking, which requires processing a linked list in a very nonneural fashion.
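The two phases contrasted above can be made concrete with a minimal Viterbi decoder (a sketch in Python with hypothetical model matrices, not code from the chapter): the forward pass propagates scores using only temporally local information, while the final backtracking pass traverses stored backpointers, which is exactly the linked-list processing the text calls nonneural.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Max-product DP over an HMM with initial distribution pi,
    transition matrix A, and emission matrix B (rows = states)."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))            # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)   # backpointers (the linked list)
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):               # temporally local forward pass
        scores = delta[t - 1][:, None] + np.log(A)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + np.log(B[:, obs[t]])
    # backtracking: a distinctly non-local traversal of stored pointers
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

With a two-state model whose emissions strongly track the underlying state, the decoder recovers the state sequence behind the observations, but only after the entire observation sequence has been processed.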

The root of this anomaly is that the recognition process is not restricted to be local in time. In the same way that neural computing emphasizes that the behavior of processing units should depend only on physically neighboring units, the sequential decision process used in recognition ideally should use only temporally local information. Dynamic programming algorithms that employ backtracking to determine a sequence of events are clearly not temporally local.

This problem has also been addressed in HMMs. In many applications, it is undesirable to wait until an entire sequence of observations is available before beginning the recognition process. A related problem is that the state space required by the DP algorithms becomes unmanageably large in processing long observation sequences. As solutions to these problems, approximations to the globally optimal DP algorithms are used. For example, the growth of the state space is restricted through pruning, and real-time sequence hypotheses are generated through partial-traceback algorithms.
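One such approximation, pruning of the active state space, can be sketched as follows (hypothetical Python, not from the chapter): at each frame, hypotheses whose score falls more than a beam width below the frame's best are discarded, bounding the state space at the cost of global optimality.

```python
import numpy as np

def beam_viterbi(obs, pi, A, B, beam=10.0):
    """Viterbi forward pass with beam pruning: states whose log-score
    falls more than `beam` below the frame's best are dropped, keeping
    the active state space bounded (an approximation to the global DP)."""
    logA, logB = np.log(A), np.log(B)
    active = {s: np.log(pi[s]) + logB[s, obs[0]] for s in range(len(pi))}
    paths = {s: [s] for s in active}
    for o in obs[1:]:
        scored, extended = {}, {}
        for j in range(A.shape[0]):
            # best surviving predecessor for state j
            best_score, best_i = max(
                (active[i] + logA[i, j], i) for i in active
            )
            scored[j] = best_score + logB[j, o]
            extended[j] = paths[best_i] + [j]
        cutoff = max(scored.values()) - beam     # prune below the beam
        active = {s: v for s, v in scored.items() if v >= cutoff}
        paths = {s: extended[s] for s in active}
    return paths[max(active, key=active.get)]
```

With a generous beam the result matches the exact search; narrowing the beam trades optimality for a smaller active set, which is the essence of the approximations discussed above.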

Suboptimal approximations to the globally optimal DP search strategies are therefore of interest in both HMM and neural network sequence recognition. One approach to describing these suboptimal strategies is to consider them as Markov decision problems (MDPs) [Ros83]. In this work the theoretical framework for such a description is presented. The observation sequence is assumed to be generated by an HMM source model, which allows the observation and recognition process to be described jointly as a first-order controlled Markov process. Using this joint formulation, the recognition problem can be formulated as an MDP, and recognition strategies can be found using stochastic dynamic programming.

The relationship of this formulation to neural network-based sequence recognition will be discussed. A stochastic neural network architecture will be presented that is particularly suited to both sequence generation and recognition. This novel architecture will be used to illustrate the MDP description of sequence recognition. The intended application is speech recognition.

URL: //www.sciencedirect.com/science/article/pii/B9780125264303500040

12th International Symposium on Process Systems Engineering and 25th European Symposium on Computer Aided Process Engineering

Joohyun Shin, Jay H. Lee, in Computer Aided Chemical Engineering, 2015

5 Conclusions

In this study, a general inventory control problem for supplying raw materials to manufacturers has been formulated as an MDP to incorporate supply and demand uncertainty. The MDP formulation enables the use of both the physical dynamics and the flow of information in sequential decision making. The problem was solved by the dynamic programming method of value iteration (VI). The performance gain from employing the more rigorous MDP formulation compared to the popular (s, S) policy was verified through a case study. The results showed that the proposed method can reduce costs by capturing uncertainties and by considering multiple criteria for supplier selection, such as lead time, replenishment cost, and limits on the order quantity. Additionally, an approximation framework using stochastic simulations and linear function approximation was tested to reduce the computational burden of exact VI. It can be an excellent alternative to exact dynamic programming, broadening the applicability of the MDP approach to practically sized problems at minimal performance loss.
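The chapter's model is not reproduced here, but the value iteration approach it describes can be sketched on a toy single-item inventory MDP with stochastic demand (all costs, probabilities, and capacities below are hypothetical, not the chapter's data):

```python
import numpy as np

MAX_INV = 10                               # inventory capacity (hypothetical)
demand_p = {0: 0.3, 1: 0.4, 2: 0.3}        # stochastic demand distribution
h, b, K, c = 1.0, 4.0, 2.0, 1.0            # holding, backlog, fixed, unit costs
gamma = 0.95                               # discount factor

def value_iteration(tol=1e-6):
    """Exact VI: repeatedly apply the Bellman operator until the
    value function (expected discounted cost) stops changing."""
    V = np.zeros(MAX_INV + 1)
    while True:
        Q = np.full((MAX_INV + 1, MAX_INV + 1), np.inf)
        for s in range(MAX_INV + 1):
            for a in range(MAX_INV + 1 - s):          # feasible order sizes
                cost = (K if a > 0 else 0.0) + c * a  # ordering cost
                for d, p in demand_p.items():
                    nxt = max(s + a - d, 0)
                    stage = h * nxt + b * max(d - (s + a), 0)
                    cost += p * (stage + gamma * V[nxt])
                Q[s, a] = cost
            # (states s, actions a; expectation taken over demand d)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=1)            # values and policy
        V = V_new
```

Running `value_iteration()` returns an order quantity for every inventory level; unlike a fixed (s, S) rule, the policy falls directly out of the Bellman recursion.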

URL: //www.sciencedirect.com/science/article/pii/B9780444635761500121

13th International Symposium on Process Systems Engineering [PSE 2018]

Jiyao Gao, Fengqi You, in Computer Aided Chemical Engineering, 2018

2 Problem Statement

As mentioned in the previous section, the proposed modeling framework integrates the leader-follower Stackelberg game with a two-stage stochastic programming approach. We aim to simultaneously optimize the design and operational decisions of different stakeholders under uncertainty. The Stackelberg game is selected to depict the sequential decision-making process among different players. Specifically, the shale gas producer in the shale gas supply chain is identified as the leader, which enjoys decision-making priority. The leader's decisions comprise "here-and-now" design decisions, including the drilling schedule at each candidate shale site, installation of gathering pipelines, and processing contract selection, as well as "wait-and-see" operational decisions, namely the production profile of shale gas, the amount of raw shale gas transported to each existing processing plant, and the water management strategy. Meanwhile, the midstream shale gas processing companies are identified as followers in the supply chain. After observing the leader's decisions, the followers react rationally and make corresponding decisions, including design decisions regarding the unit process fee associated with the processing contracts and operational decisions on the planning of processing and distribution activities.

Following the stochastic programming approach, uncertainties are represented by discrete scenarios with given probabilities. Thus, both the leader and the followers strive to maximize their own expected net present value (NPV).
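The two-stage structure can be illustrated with a deliberately tiny numeric sketch (all names, prices, and capacities below are hypothetical, not from the paper): a here-and-now design decision is fixed first, a wait-and-see recourse decision is chosen per scenario after the uncertainty is revealed, and the objective is the probability-weighted (expected) NPV.

```python
# Hypothetical two-stage setup: first-stage design x (capacity, capex),
# second-stage recourse chosen after the uncertain price is observed.
scenarios = [(0.3, 2.0), (0.5, 3.0), (0.2, 4.5)]    # (probability, price)
designs = {"small": (100, 40), "large": (220, 90)}  # (capacity, capital cost)
OPEX = 2.5                                          # unit operating cost

def expected_npv(capacity, capex):
    """Expected NPV = -capex + sum over scenarios of p * recourse profit."""
    total = -capex
    for prob, price in scenarios:
        # wait-and-see recourse: produce at capacity only when profitable
        production = capacity if price > OPEX else 0
        total += prob * (price - OPEX) * production
    return total

# here-and-now decision: pick the design maximizing expected NPV
best = max(designs, key=lambda d: expected_npv(*designs[d]))
```

The recourse rule (produce only when the realized price covers operating cost) is what distinguishes this from a deterministic plan evaluated at the mean price.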

URL: //www.sciencedirect.com/science/article/pii/B9780444642417502627

29th European Symposium on Computer Aided Process Engineering

Rajesh Govindan, Tareq Al-Ansari, in Computer Aided Chemical Engineering, 2019

5 Conclusions

The study discussed in this paper provides an illustrative example of how simulation-based optimisation can aid in making large-scale logistical decisions for network-based applications with economic implications. With the recent advent of artificial intelligence, non-linear, multi-objective, sequential decision-making techniques such as reinforcement learning are useful in solving hard problems whose objective functions lack closed-form analytical representations, such as the CO2 fertilisation network for the enhancement of agricultural productivity.

It is envisaged that the results obtained have important implications for planning the expansion of greenhouse networks, particularly for the State of Qatar, where self-sustenance and food security have become a priority. In this regard, it has been considered that the environmental and economic dimensions are equally important. As such, the utilisation of waste streams such as CO2 emissions for growing crops not only provides for food security, but also enhances the efficiencies of energy and water utilisation, as highlighted in the research carried out by the authors.

With regard to the limitations of the current work, only a single agent has been considered thus far, and hence there is scope to extend the work to multiple learning agents in large-scale network optimisation. The business logic considered is also simplified for the current illustration; hence the simulator requires agent functions for real-world scenarios, e.g., in multi-echelon supply chains, where dynamic distribution of resources, inventory and supply-risk management, and long-term supply chain planning and development are some of the core problems to be solved.

URL: //www.sciencedirect.com/science/article/pii/B9780128186343502526

Constraint Optimization

Rina Dechter, in Constraint Processing, 2003

13.8 Bibliographical Notes

Algorithms for discrete combinatorial optimization tasks were developed over the past five decades by the operations research community under the umbrella term integer programming. While it is known how to solve optimization problems when the constraints and cost functions are linear over continuous domains, the task is hard when the domains are discrete. Branch-and-bound search for integer programming uses a lower-bounding function derived by relaxing the integrality constraints and solving a continuous linear programming problem [Fulkerson, Dantzig, and Johnson 1954]. The idea of branch-and-bound can be traced back to the work of Fulkerson, Dantzig, and Johnson [1954, 1959] on integer programming and the traveling salesman problem [Rinnooy Kan et al. 1985]. An early survey appears in Lawler and Wood [1966].

In the late 1970s and early 1980s, formal descriptions of branch-and-bound methods were given by Ibaraki [1976, 1978] and Kohler and Steiglitz [1974] in the more general setting of state-space search, and a relationship with dynamic programming for sequential decision making was discussed [Ibaraki 1978; Kumar and Kanal 1983]. In artificial intelligence, branch-and-bound algorithms were investigated under the umbrella term heuristic search. It was recognized that best-first search and depth-first branch-and-bound were special cases of branch-and-bound [Nilsson 1980; Pearl 1984].

Dynamic programming was developed by Bellman [1957] as an alternative to branch-and-bound search. He introduced the idea in the context of sequential decision making. The perception of nonserial dynamic programming as a variable elimination algorithm is described in detail in Bertelè and Brioschi [1972]. They observed the dependence of nonserial dynamic programming, or variable elimination, on an order-based graph parameter that they called "dimension" and that we call induced width. They also presented several greedy algorithms for bounding a graph's dimension.

Constraint processing in the past two decades has shifted toward constraint optimization. The work of extending backtracking algorithms to optimization was initiated by Freuder [1992]. More recent research has focused on extending constraint propagation ideas to optimization, especially for bounding the evaluation functions. A variety of lower bound functions were developed that can be related to constraint propagation [Meseguer, Larrosa, and Schiex 1999; Kask and Dechter 2001; Schiex 2000]. The first-cut lower bound function is an example of extending forward checking. The mini-bucket scheme presented in this chapter was introduced by Dechter [1997] and Dechter and Rish [1997, 2003], and its use for lower bounding function generation was introduced by Kask and Dechter [2001]. The Russian doll idea is due to Lemaître, Verfaillie, and Schiex [1996].

Significant effort has recently been devoted to a variety of soft constraint types and algorithms. There are several frameworks for soft constraints, such as the semi-ring-based formalism [Bistarelli, Montanari, and Rossi 1997], where each tuple in each constraint has an associated element taken from a partially ordered set (a semi-ring), and the valued constraint formalism, where each constraint is associated with an element from a totally ordered set. These formalisms are general enough to model classical constraints, weighted constraints, fuzzy constraints, and overconstrained problems. Current research effort is focused on extending propagation and search techniques to this more general framework.

URL: //www.sciencedirect.com/science/article/pii/B9781558608900500141

27th European Symposium on Computer Aided Process Engineering

Chao Ning, Fengqi You, in Computer Aided Chemical Engineering, 2017

1 Introduction

In recent years, robust optimization has gained increasing popularity [Bertsimas et al., 2011] and achieved success in a broad array of applications, such as biofuel supply chains [Tong et al., 2014], biomass processing [Gong et al., 2016], and batch process scheduling [Shi et al., 2016]. Two-stage adaptive robust optimization (ARO) grants more flexibility than static robust optimization by introducing recourse decisions [Ben-Tal et al., 2004], and typically generates less conservative solutions [Gong et al., 2017]. To overcome the limitation of two-stage structures, multistage ARO was recently proposed to offer a new paradigm for non-anticipative sequential decision-making processes [Delage et al., 2015]. Recently, a nested stochastic robust optimization approach was proposed to handle multiscale uncertainties [Yue et al., 2016]. Notably, most existing approaches hedge against uncertainty realizations regardless of their occurrence probability, thus generating over-conservative solutions.

Data-driven robust optimization has been proposed recently [Bertsimas et al., 2013]. In the data-driven framework, uncertainty sets are constructed directly from uncertainty data. Despite the attractive features of data-driven approaches, most existing publications in this area are restricted to static robust optimization. To the best of our knowledge, a data-driven approach for multistage ARO has not been considered in the existing literature. Therefore, this paper aims to fill that knowledge gap. Although we only present the application to scheduling, the proposed framework can handle general applications of optimization under uncertainty, such as process network planning [You et al., 2011], supply chain optimization [Garcia et al., 2015], biofuel energy systems [Yue et al., 2014], shale gas energy systems [Gao et al., 2017], process design and synthesis [Gong et al., 2015], and sustainability [Garcia et al., 2016].

URL: //www.sciencedirect.com/science/article/pii/B9780444639653503792

27th European Symposium on Computer Aided Process Engineering

Chao Ning, Fengqi You, in Computer Aided Chemical Engineering, 2017

1 Introduction

In the past few decades, optimization of process systems under uncertainty has attracted wide attention from both academia and industry. Robust optimization (RO) has emerged as a popular approach due to its strong ability to hedge against uncertainties and its computational tractability [Bertsimas et al., 2011]. It has achieved success in a broad array of applications, such as biofuel supply chains [Tong et al., 2014], biomass processing [Gong et al., 2016], and batch process scheduling [Shi and You, 2016]. Traditional RO approaches, also known as static robust optimization, make all the decisions at once. This framework does not fit well with sequential decision-making problems. To this end, adaptive or adjustable robust optimization (ARO) was proposed to offer a new paradigm for optimization under uncertainty by incorporating recourse decisions [Ben-Tal et al., 2004]. Due to the flexibility of adjusting some decisions to counteract uncertainties, ARO typically generates less conservative solutions than static robust optimization [Yue and You, 2016; Gong and You, 2017].

Big data is reshaping both operations research and process systems engineering. More recently, dramatic progress in mathematical programming methods, coupled with recent advances in machine learning algorithms, has sparked a flurry of interest in the research field of data-driven optimization [Bertsimas et al., 2013]. Traditional ARO approaches typically fail to take full advantage of data and make a priori, simplistic assumptions about uncertainty, such as independence and symmetry. These assumptions may not be reasonable for real-world applications. We propose a data-driven ARO framework and apply it to the process network planning problem.
Although we only present the application of this DDANRO framework to planning, the proposed framework is general enough to handle a variety of applications of optimization under uncertainty, such as scheduling [Chu and You, 2015; Wassick et al., 2012], supply chain optimization [Garcia and You, 2015], energy systems [Yue et al., 2014; Gong and You, 2015; Gao and You, 2017], and sustainability [Garcia and You, 2016].
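The conservatism gap between static RO and ARO can be illustrated with a deliberately small example (all numbers hypothetical, not from the paper): two products draw demand from a budgeted uncertainty set, and allowing recourse allocation after the demand is observed reduces the worst-case stock requirement.

```python
# Two products with demand (d1, d2) from the budgeted uncertainty set
# {0 <= d1, d2 <= 8, d1 + d2 <= 12} (hypothetical integer demands).

# Static RO: dedicated stock per product is fixed before demand is seen,
# so each x_i must cover its own worst case independently: x_i = 8.
static_total = 8 + 8

# Adjustable RO: a single shared stock x is allocated to the products
# *after* demand is observed (recourse), so feasibility only requires
# x >= max over the set of (d1 + d2), which the budget caps at 12.
adjustable_total = max(
    d1 + d2
    for d1 in range(9) for d2 in range(9)
    if d1 + d2 <= 12
)
```

The static solution holds 16 units against a demand total that can never exceed 12; the recourse decision removes exactly that slack, which is the "less conservative" behavior the text attributes to ARO.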

URL: //www.sciencedirect.com/science/article/pii/B9780444639653502269

Proceedings of the 9th International Conference on Foundations of Computer-Aided Process Design

Jian Gong, Fengqi You, in Computer Aided Chemical Engineering, 2019

Challenge # 3: Handling uncertainties and resilience using robust optimization leveraging machine learning methods

Uncertainties arise across an energy system. For example, feedstock compositions, product demands, and material prices are common parameters that can be inherently uncertain or fluctuate over time [Gao et al., 2017d]. Since the optimal design decisions of an energy system can be suboptimal or even infeasible if the parameters are subject to uncertainty, it is crucial to handle uncertainties and obtain robust solutions with efficient modeling and optimization techniques. There are a number of systematic methods for handling uncertainties in design optimization problems, including stochastic programming [Kall et al., 1994], robust optimization [Ben-Tal et al., 2009], fuzzy programming [Zimmermann, 1978], and chance-constraint methods [Charnes et al., 1959]. One important group of process design problems under uncertainty has been developed to maximize flexibility [Grossmann et al., 1978; Halemane et al., 1983; Swaney et al., 1985]. Research along this vein has focused on various types of uncertain parameters [Pistikopoulos et al., 1990; Rooney et al., 2001; Straub et al., 1990] and efficient solution strategies [Grossmann et al., 1987]. Another active research area for process design under uncertainty employs two-stage stochastic programming with scenario-based recourse [Acevedo et al., 1998; Hene et al., 2002; Li et al., 2014; Pistikopoulos et al., 1995]. However, handling uncertainties with these methods is limited. Although a general framework for optimal design under uncertainty has been proposed [Halemane & Grossmann, 1983], existing strategies for process flexibility problems require fixed process designs. Stochastic programming, in contrast, allows simultaneous optimization of design variables under uncertainty, but is useful only when there is a reasonable number of scenarios. Increasing the number of scenarios to improve solution accuracy can significantly enlarge the problem size and lead to intractability.
Furthermore, stochastic programming requires probability distribution functions that are usually difficult to acquire. Therefore, there is a need for tractable methods that allow simultaneous optimization of design variables. Robust optimization, in contrast, does not require any knowledge of probability distributions, and it guarantees feasibility for all possible realizations of uncertainty. However, traditional robust optimization approaches treat all decisions as "here-and-now", which is a major limitation of this technique. Multistage, or adjustable, robust optimization has been proposed and studied for uncertainty in sequential decision-making processes by postulating a functional dependence of decision variables on parameter realizations [Ben-Tal et al., 2004]. Nevertheless, it remains a challenge to understand uncertainty types, address uncertainties in integrated optimization models, and develop tailored solution strategies for efficient global optimization in energy system problems.

Handling uncertainties in the sustainable design and synthesis of energy systems leads to another research challenge, which comprises three parts. First, we must fully understand and identify the uncertainties in energy systems. Uncertainties can emerge within an energy system, in the nexus of the energy system and its externalities, and also outside the energy system. Besides, uncertainties can also be involved in process models, techno-economic assessment, and life cycle sustainability analysis. Without a sound understanding of uncertainties, it is impossible to capture and handle them in a systematic manner. Second, uncertainties need to be quantified effectively. The simplest uncertainty set requires only an upper and a lower bound, which can nevertheless be difficult to obtain. Therefore, properly defining uncertainty sets can be quite challenging and will require substantial, careful, and iterative research efforts. Finally, we seek to incorporate uncertainties into the superstructure optimization models and employ efficient solution algorithms to determine the optimal process decisions. Robust optimization is concerned with worst-case optimal solutions, which are of great interest in designing resilient energy systems. Moreover, robust optimization allows tractable solution methods for certain types of uncertainty sets [Shi et al., 2016]. This attractive feature cannot be exploited by single-stage superstructure optimization problems for energy systems; classic solution methods for robust optimization require the dual formulation of robust counterpart problems, which can be impossible to formulate for general MINLP problems. More importantly, traditional robust optimization approaches consider all decisions as "here-and-now", but the design and operational decisions of a sustainable energy system are in reality made sequentially.
The two-stage adaptive robust optimization approach has been proposed as an efficient and tractable way to handle uncertainty in sequential decision-making problems in the sustainable design and synthesis of energy systems [Ben-Tal et al., 2004]. There are also efforts to combine robust optimization and stochastic programming for tackling multi-scale uncertainties [Yue & You, 2016]. Nevertheless, challenges remain in formulating a proper multi-stage robust optimization model for process design and synthesis problems and in devising a solution strategy for the resulting optimization problems. Some recent works [Gong et al., 2016] proposed efficient solution algorithms for robust MINLP problems in energy systems design, but a general-purpose algorithm remains a challenge. Since robust mixed-integer linear programs are generally computationally expensive, there is a need to develop efficient solution algorithms for solving large-scale multistage robust MINLP problems.

Besides, recent advances in data-driven robust optimization methods [Bertsimas et al., 2017; Ning et al., 2017a; Shang et al., 2017] should be leveraged to address some of these challenges in energy systems design under uncertainty. Some of these tools and methods have been successfully applied to relevant problems in production scheduling [Ning et al., 2017b, 2018c], process network design and planning [Ning et al., 2018a, 2018b], process control [Shang et al., 2019], and supply chain optimization [Gao et al., 2019]. There are clear direct applications and promising avenues for using and further developing these techniques to address energy system design under uncertainty.

Another important research area regarding uncertainties is resilience in sustainable design and synthesis of energy systems. Resilience was first proposed in ecology [Holling, 1973] and it was concerned with the persistence of systems regarding unexpected change and disturbance. Later resilience was introduced and actively studied in various infrastructure systems, such as electric power systems [Ouyang et al., 2012], transportation systems [Bocchini et al., 2012], water delivery systems [Chang et al., 2004], health care systems [Bruneau et al., 2007], supply chains [Garcia-Herreros et al., 2014], etc.

Although resilience is clearly defined as the ability to recover quickly from disruptions [Dinh et al., 2012], it is rarely considered in the process design and synthesis of energy systems, and there remains a research gap in developing a general framework for designing resilient energy systems. To establish such a framework, we must first quantitatively answer three questions: [1] what types of disruptions might impact an energy system, [2] how often do these disruptions happen, and [3] how do these disruptions influence the normal operation of an energy system. The answers to these questions are at the core of building a fragility model and a restoration model of each equipment unit [Ouyang et al., 2014]. However, answering such questions is challenging in practice because it may involve extracting large amounts of data from the literature and employing sophisticated statistical and probabilistic tools for data analysis. Once we have a thorough understanding of disruptions, a second challenge emerges: proposing multi-stage disruption-prevention strategies. Chronologically, a resilient process should enhance resistance to disruptions, reduce failure possibilities, prepare redundant units at the design stage, and ensure an efficient recovery response and effective recovery sequence after disruptions take place [Ouyang et al., 2012]. Incorporating these strategies into mathematical models is never trivial, since the models need to account for critical operating conditions, failure probability, restoration time/function, and the monetary expense of various equipment units at different damage levels, as well as the response time of repair contractors. Before developing a framework, we must integrate the above mathematical models with superstructure optimization models for sustainable process design. In concert with the second challenge, a two-stage model should be adopted.
We also need recovery sequence constraints and linking constraints to associate resilience models with superstructure optimization models. In addition, a stochastic model may be employed if a disruption of an energy system is induced by breakdowns of individual units, with the possible combinations of these breakdowns forming a set of scenarios [Garcia-Herreros et al., 2014; Zhao et al., 2019]. A multi-objective optimization model may also be considered, with one of the objective functions optimizing a tailored resilience metric [Bocchini & Frangopol, 2012]. The key challenge of this model, however, lies in defining a reasonable time-dependent performance function. Unlike network systems, streams in an energy system are not interchangeable due to composition differences. Therefore, a process system is intrinsically more vulnerable, indicating the need for future research into a general framework for resilient energy systems [Gong et al., 2018b].

URL: //www.sciencedirect.com/science/article/pii/B9780128185971500345

An OpenSim guided tour in machine learning for e-health applications

Mukul Verma, ... Harpreet Singh, in Intelligent Data Security Solutions for e-Health Applications, 2020

2 State of the art

An OpenSim-based platform allows anyone to readily develop models and explore new machine learning control methods with physiologically accurate musculoskeletal models [2]. OpenSim can be used for any purpose, including nonprofit and business applications, under the Apache License 2.0. Simulations built by the biomechanics community can be analyzed, changed, enhanced, and tested through multiinstitutional collaboration on the base provided by OpenSim. The OpenSim GUI is coded in Java and the core is coded in C++. OpenSim makes it possible to create contact models, muscle models, and many other exciting models.

Models can use this code by installing plugins (which are shared by users), with no need to change or recompile the code. Users can create their own models and simulations from the GUI and can also analyze and reference existing models and simulations. The programs are also freely and anonymously accessible on GitHub, and contributions are readily accepted.

Nowadays, machine learning models use training and testing practices to modernize biomechanical data analysis. The potential of other approaches like deep learning is also being used to produce knowledge of human movements. To increase the effectiveness of research in biomechanics, the sharing of data and cross-training are essential. Traditional approaches like inferential statistics are being superseded by modern machine learning methods, data mining, and predictive modeling. These approaches provide advanced solutions compared to traditional ones. Machine learning approaches are trained and interpreted based on well-known reporting standards, in comparison with traditional statistical tools. Predictive modeling is used to make accurate predictions by mapping input data to an output response; for example, the status of a disease is the input and kinematic waveforms are the output. In the biomechanics movement field, the use of machine learning techniques is expanding, but evaluation of these studies remains difficult. The use of machine learning approaches is continuously increasing due to large data availability, so it must be ensured that valid, as well as reproducible, conclusions are provided. Thus, good practices in reporting and conducting research that connects biomechanics and machine learning are still needed. Moreover, machine learning methods increase quality and propose further visible standards for future research in this area. The biomechanics and computer science communities have started to work together to model the neural control of movements using reinforcement learning. These joint efforts and advancements will act as a bridge to prepare physiologically accurate biomechanical models [2]. Reinforcement learning is mostly used where interactions with the environment train the agent, and the agent learns to perform a task so as to maximize its reward.
For example, in OpenSim, an agent based on a musculoskeletal model learns to walk by selecting muscle excitations as actions [8]. A similar method was used to train a program to play the game of Go; as a result, the program can outperform humans in gameplay without any human input [9]. Future research in this field will enhance the understanding of human neural movements and hence will eventually improve human-machine interfaces. It has been shown that deep reinforcement learning techniques, despite their high computational cost, can be effectively utilized for synthesizing physiological motion in high-dimensional biomechanical systems [10]. Reinforcement learning is a part of machine learning in which an agent learns the optimal policy for sequential decision making without complete knowledge of the environment [11,12]. In RL techniques, the agent explores the environment by taking actions and updates its policy to maximize the reward. General controllers for movement use the latest reinforcement learning techniques. These techniques significantly reduce the need for a user to manually tune the controllers compared to previously developed gait controllers. As an example, controllers for the locomotion of complex humanoid models can be trained by reinforcement learning [13,14]. Though these strategies found solutions without domain-specific data, the ensuing motions were not realistic. One of the key reasons is that biologically correct actuators were not used by these models. The main objectives were [3] to:

Use reinforcement learning to solve problems in medical management, and promote the free use of reinforcement learning research tools [the physics simulator, the reinforcement learning environment, and the competition platform that users can employ to run the challenges].

Encourage reinforcement learning research in computationally complex environments with noisy, high-dimensional action spaces relevant to real-life applications.

Bridge biomechanics, neurobiology, and computer science communities.
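The agent-environment loop described above, in which an agent explores by taking actions and updates its policy to maximize reward, can be sketched with tabular Q-learning. The 1-D "corridor" environment, rewards, and all hyperparameters below are illustrative assumptions, far simpler than the musculoskeletal walking task in OpenSim.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [1, -1]     # step right or left

def step(state, action):
    """Move the agent; reward 1.0 only on reaching the goal state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy: explore the environment, otherwise act greedily
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r, done = step(s, a)
            # Q-learning backup toward the reward-maximizing target
            target = r if done else r + gamma * max(Q[(s2, a2)] for a2 in ACTIONS)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

Q = train()
# greedy action per non-terminal state after training
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
```

The same loop structure, with the toy corridor replaced by a physics simulation and the table replaced by a neural network, underlies the deep RL gait controllers discussed above.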

The next section discusses the basic musculoskeletal elements and the capabilities of OpenSim.

2.1 Basic musculoskeletal elements and capabilities of OpenSim

Fig. 2 shows the fundamental components of musculoskeletal simulation in OpenSim. A musculotendon actuator is a significant component of any human musculoskeletal system. These actuators are modeled as frictionless, massless, extensible strings that attach to and wrap around other structures [15]. Musculotendon models are necessary components of muscle-driven simulations; however, the computational speed and biological accuracy of these models have not yet been thoroughly evaluated. Muscle-driven simulations depend on computational models of musculotendon dynamics. Generally, parameters such as peak isometric muscle force and the corresponding optimal fiber length, the muscle's maximum intrinsic shortening velocity, pennation angle, and tendon slack length are specified to customize the generic model for specific muscles. Cross-bridge models and Hill-type models are the two main categories of musculotendon models. Cross-bridge models explain the mechanism of muscle contraction by describing the interaction between the actin and myosin filaments: myosin was originally supposed to attach to actin, forming a cross-bridge between the filaments; once attached, cross-bridges cause the actin filament to slide and force to be produced. The advantage of cross-bridge models is that they are derived from the basic structure of muscle and can estimate parameters that are difficult to measure experimentally, but because of their computational cost the main focus here is on Hill-type models. Hill [16,17] was interested in modeling the external behavior of muscle rather than its underlying physiology. In doing so, a lumped-parameter approach was taken, whereby the muscle was modeled as a damped, active force-generating contractile element with a parallel elastic element representing its passive elasticity, attached to a series elastic element representing the tendon.
A muscle's capacity to generate force depends on its length and velocity: the actin and myosin filaments have maximum overlap at an optimal length, which produces the greatest force-generating capacity, and force drops as the length deviates from this optimum in either direction.
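The length and velocity dependence described above can be sketched numerically. The Gaussian force-length curve, the Hill hyperbola for shortening, and the constants `width` and `a_f` below are common textbook approximations chosen for illustration, not the exact curves implemented in OpenSim.

```python
import math

def active_force_length(l_norm, width=0.45):
    """Normalized active force-length curve: peak force at optimal fiber
    length (l_norm = 1), falling off on either side as filament overlap
    decreases. Gaussian shape and width are illustrative assumptions."""
    return math.exp(-((l_norm - 1.0) / width) ** 2)

def force_velocity(v_norm, a_f=0.25):
    """Normalized concentric force-velocity relation (Hill hyperbola).
    v_norm is shortening velocity in units of the maximum shortening
    velocity; force falls to zero at v_norm = 1."""
    v = max(0.0, min(v_norm, 1.0))
    return (1.0 - v) / (1.0 + v / a_f)

def fiber_force(activation, l_norm, v_norm, f_max=1000.0):
    """Active fiber force = a * F_max * f_l(l) * f_v(v).
    The parallel and series elastic elements are omitted for brevity."""
    return activation * f_max * active_force_length(l_norm) * force_velocity(v_norm)
```

For example, `fiber_force(1.0, 1.0, 0.0)` returns the peak isometric force, while any deviation in length or a nonzero shortening velocity reduces the output, mirroring the behavior of the contractile element in a Hill-type model.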

Fig. 2. Simulation of musculoskeletal elements used in OpenSim.

Movement originates from a complex coordination of the muscular, sensory, skeletal, and neural systems. Human and animal movements can be analyzed and predicted with the models included in OpenSim. Neural commands sent to muscles as excitations can be estimated from experimental data [e.g., electromyography] or from controller models. OpenSim's Hill-type musculotendon models embody the force-length and force-velocity relations and determine muscle forces from excitations. OpenSim gives users the freedom to analyze different muscle geometries, and muscle-path parameters and contraction dynamics can be adjusted to match experimental data. Users can run forward simulations, in which muscle forces and moments generate motion, and can also resolve the muscle forces and moments that produced an observed motion, a process called inverse simulation. This is possible thanks to the Simbody engine for multibody dynamics and a number of other solvers and integrators; contact models included in the Simbody engine support these simulations and analyses.
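The distinction between forward and inverse simulation can be illustrated with a toy single-joint system. The pendulum-like equation of motion and all parameter values are illustrative assumptions; no OpenSim API is used here.

```python
import math

# Toy single-joint segment: I * theta'' = tau - m*g*L*sin(theta).
# Inertia, mass, gravity, and segment length are made-up numbers.
I, m, g, L = 0.5, 2.0, 9.81, 0.5
DT = 0.001

def forward(torque_fn, theta0=0.0, omega0=0.0, steps=1000):
    """Forward simulation: integrate motion from a known joint torque
    using semi-implicit Euler steps."""
    theta, omega, traj = theta0, omega0, []
    for k in range(steps):
        alpha = (torque_fn(k * DT) - m * g * L * math.sin(theta)) / I
        omega += alpha * DT
        theta += omega * DT
        traj.append(theta)
    return traj

def inverse(theta, omega, alpha):
    """Inverse simulation: recover the joint torque that produces a given
    joint angle, velocity, and acceleration."""
    return I * alpha + m * g * L * math.sin(theta)
```

Forward simulation turns forces into motion; inverse simulation runs the same equation of motion backwards, solving for the torque given an observed (measured) motion, which is the relationship the paragraph above describes.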

2.2 OpenSim capabilities

OpenSim has numerous capabilities; for example, it can calculate variables that would take a lot of resources to determine experimentally, such as the forces produced by muscles and the lengthening and shortening of tendons during a movement. Movements such as kinematic variations in human gait [inclined as well as loaded walking] or other motor control behaviors can be predicted from models using OpenSim [4]. Its capabilities cover several core areas, as shown in Fig. 3.

Fig. 3. Capabilities of OpenSim used in various applications.

1.

Biomechanical models could be built, interrogated, and manipulated by users [5].

2.

The computed muscle control [CMC] tool in OpenSim adjusts the model to be dynamically consistent and calculates the excitation of each muscle during motion [18,19]. Before the muscle activations can be computed, the inconsistencies that appear in the calculation of the system dynamics should be minimized. These residual errors normally occur because of measurement errors, model errors, and/or errors in previous calculations. In OpenSim, these errors are mitigated by adjusting the center of mass of a model segment and allowing adjustments to the kinematics of the motion [20]. Errors should be adjusted after the model has been scaled and has gone through the inverse kinematics [IK] calculations. After the model has been adjusted, CMC can be run to determine the muscle activations that produce the dynamics of the motion. The results of the CMC calculation are displayed visually when the motion sequence is played back: the colors of the muscles change as a function of their activations.

3.

OpenSim may be used to simulate and analyze musculoskeletal dynamics as well as neuromuscular control. Modeling helps learners and researchers pursue studies that are difficult to perform experimentally.

4.

OpenSim’s plotting tool allows almost any parameter of models, motions, or other calculated values to be plotted against any other value, which is extremely useful for visualizing data. The plotting tool is intuitive and easy to use, and data from multiple models or calculations can be plotted simultaneously.

5.

Different movements, and how they vary under changed conditions, can be predicted by OpenSim. This is done purely by applying principles of neuromuscular control and dynamic simulation, without carrying out any experiments. OpenSim has been used in this way to develop an in-depth understanding of the coordination between muscles during inclined and loaded walking, the limits of reflexes during landing [7], and the optimal design of devices to reinforce jumping performance [21].

6.

The forward dynamics tool in OpenSim allows the dynamics to be recalculated after the muscle excitations from CMC have been edited; a separate tool is available for editing the excitations. This is useful for viewing the effects of muscle excitations on motion.

7.

Researchers can build and share their own models [22,23], simulation tools, and numerical methods [24–27] thanks to the standard, extensible architecture provided by OpenSim.
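The muscle-redundancy problem that CMC resolves [item 2 above] can be sketched with a static-optimization-style toy example: distribute a required joint moment across redundant muscles by minimizing the sum of squared activations. The peak forces, moment arms, and quadratic cost below are illustrative assumptions; OpenSim's CMC solves a much richer tracking problem.

```python
# Hypothetical muscles crossing one joint: peak isometric forces [N]
# and moment arms [m] are made-up illustrative numbers.
F_MAX = [1000.0, 600.0, 300.0]
R     = [0.05, 0.04, 0.02]

def static_optimization(moment):
    """Minimize sum(a_i^2) subject to sum(a_i * F_i * r_i) = moment,
    with activations clipped to [0, 1]. For an unconstrained quadratic
    cost the Lagrange solution is a_i proportional to F_i * r_i."""
    gains = [f * r for f, r in zip(F_MAX, R)]       # moment per unit activation
    lam = moment / sum(g * g for g in gains)        # Lagrange multiplier
    return [min(max(lam * g, 0.0), 1.0) for g in gains]

activations = static_optimization(20.0)
# the activations reproduce the requested 20 N*m joint moment
achieved = sum(a * f * r for a, f, r in zip(activations, F_MAX, R))
```

Stronger muscles with larger moment arms receive proportionally larger activations, which is the qualitative behavior CMC and static optimization produce in muscle-driven simulations.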

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9780128195116000030

A survey of robotic motion planning in dynamic environments

M.G. Mohanan, Ambuja Salgoankar, in Robotics and Autonomous Systems, 2018

2.5.5 Markov decision processes [MDP] and partially observable Markov decision process [POMDP]

MDPs are an approach to sequential decision-making when outcomes are random and uncertain. An MDP model consists of decision epochs, states, actions, rewards, and transition probabilities [74]. The process is outlined briefly below:

[1] At time t, a particular state x of the Markov chain is observed.

[2] After the observation of the state, an action, say u, is taken from a set of possible decisions [different states may have different sets of decisions].

[3] An immediate gain [or loss] r[x,u] is then incurred according to the current state x and the action u taken.

[4] The process then moves to a next state x′ according to the evolution or transition probability p[x′|u,x], which depends on the action u.

[5] As the time t increases and another transition occurs, all the steps stated above are repeated.

The value iteration is done by the recursive formula

[12] VT = γ maxu [r[x,u] + ∫ VT−1[x′] p[x′|u,x] dx′], with V1 = γ maxu r[x,u]

where γ is a problem-dependent discount factor which takes values in [0, 1].
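The recursion in Eq. [12] can be run directly on a discrete MDP, where the integral becomes a sum over next states. The two-state, two-action MDP below is an illustrative assumption; following the form of Eq. [12], the discount γ multiplies the whole bracketed term.

```python
GAMMA = 0.9
STATES, ACTIONS = [0, 1], [0, 1]
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
# All transition probabilities and rewards are made-up illustrative numbers.
P = {0: {0: [(0, 0.8), (1, 0.2)], 1: [(1, 1.0)]},
     1: {0: [(0, 1.0)],           1: [(1, 0.9), (0, 0.1)]}}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

def value_iteration(iters=200):
    """Iterate V(s) = gamma * max_a [ R(s,a) + sum_s' p(s'|s,a) V(s') ]
    until (approximate) convergence to the fixed point."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: GAMMA * max(R[s][a] + sum(p * V[s2] for s2, p in P[s][a])
                            for a in ACTIONS)
             for s in STATES}
    return V

V = value_iteration()
```

Because γ < 1 the update is a contraction, so repeated application converges to a unique fixed point regardless of the initial values.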

POMDP has been formally given in [75] as follows:

Let X ⊂ Rn be the space of all possible states x of the robot, U ⊂ Rm the space of all possible control inputs u of the robot, and Z ⊂ Rk the space of all possible sensor measurements z that the robot may receive. General POMDPs accept as input a dynamics model and an observation model, given here in probabilistic notation:

xt+1 ∼ p[xt+1 | xt, ut]; zt ∼ p[zt | xt]

where xt ∈ X, ut ∈ U, and zt ∈ Z are the robot’s state, control input, and measurement at stage t, respectively.

The belief b[xt] of the robot is defined as the distribution of the state xt, given all past control inputs and sensor measurements.

b[xt] = p[xt | u0,…,ut−1; z1,…,zt].

Given a control input ut and a measurement zt+1, the belief is propagated using Bayesian filtering:

[13] b[xt+1] = η p[zt+1|xt+1] ∫ p[xt+1|xt,ut] b[xt] dxt,

where η is a normalizer independent of xt+1. Denoting the belief b[xt] by bt and the space of all possible beliefs by B ⊂ {X → [0, 1]}, the belief dynamics can be written as a function β: B×U×Z → B,

bt+1 = β[bt, ut, zt+1].
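Over a discrete state space the Bayesian filter of Eq. [13] replaces the integral with a sum and η with a normalizing division. The two-state transition and observation models below are illustrative, and a single fixed control input u is assumed, so the transition table omits the u index.

```python
STATES = [0, 1]
# TRANS[x][x2] = p(x2 | x, u) for one fixed control input u (illustrative)
TRANS = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
# OBS[x][z] = p(z | x) for observations z in {0, 1} (illustrative)
OBS = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}

def belief_update(b, z):
    """b_{t+1}[x'] = eta * p(z|x') * sum_x p(x'|x,u) * b[x]  (Eq. [13])."""
    # prediction step: propagate the belief through the dynamics model
    predicted = {x2: sum(TRANS[x][x2] * b[x] for x in STATES) for x2 in STATES}
    # correction step: weight by the observation likelihood
    unnorm = {x2: OBS[x2][z] * predicted[x2] for x2 in STATES}
    eta = 1.0 / sum(unnorm.values())   # normalizer independent of x'
    return {x2: eta * unnorm[x2] for x2 in STATES}

b1 = belief_update({0: 0.5, 1: 0.5}, z=0)
```

Each call implements one application of the belief dynamics β: starting from a uniform belief, observing z = 0 shifts probability mass toward state 0, since that state is far more likely to emit z = 0.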

The POMDP problem is to find a control policy Πt: B → U for all 0 ≤ t that optimizes the expected value of a given objective [e.g., maximizes the expected accumulated reward] over the planning horizon.
