In this paper, dynamic non-cooperative coexistence between a cognitive pulsed radar and a nearby communications system is addressed by applying nonlinear value function approximation via deep reinforcement learning (Deep RL) to develop a policy for optimal radar performance. We demonstrate that our approach, based on the Deep Q-Learning (DQL) algorithm, enhances important radar metrics, including SINR and bandwidth utilization, more effectively than policy iteration or sense-and-avoid (SAA) approaches in a variety of realistic coexistence environments. The radar learns to vary the bandwidth and center frequency of its linear frequency modulated (LFM) waveforms to mitigate mutual interference with other systems and improve target detection performance while also maintaining sufficient utilization of the available frequency bands required for a fine range resolution. The DQL-based approach is also extended to incorporate Double Q-learning and a recurrent neural network to form a Double Deep Recurrent Q-Network (DDRQN), which yields favorable performance and stability compared to DQL and policy iteration. The practicality of the proposed scheme is demonstrated through experiments performed on a software defined radar (SDRadar) prototype system. Experimental results indicate that the proposed Deep RL approach significantly improves radar detection performance in congested spectral environments compared to policy iteration and SAA.
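The abstract above does not give the paper's network architecture, state encoding, or reward; the following is a minimal NumPy sketch of the core idea, assuming a hypothetical discretized action space of (center frequency, bandwidth) pairs, a sensed-occupancy state vector, and an invented reward that trades bandwidth utilization against interference. All names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization: the agent picks a (center-frequency channel,
# bandwidth) pair for the next LFM chirp.
N_CHANNELS, N_BANDWIDTHS = 5, 3
N_ACTIONS = N_CHANNELS * N_BANDWIDTHS
STATE_DIM = N_CHANNELS            # sensed occupancy per channel (1 = interference)
HIDDEN = 32

# One-hidden-layer Q-network (nonlinear value function approximation).
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS)); b2 = np.zeros(N_ACTIONS)

def q_values(s):
    h = np.maximum(0.0, s @ W1 + b1)          # ReLU hidden layer
    return h @ W2 + b2, h

def reward(s, a):
    """Toy reward: wider bandwidth (finer range resolution) is good, but
    overlapping occupied channels (degraded SINR) is penalized."""
    ch, bw = divmod(a, N_BANDWIDTHS)
    span = range(ch, min(ch + bw + 1, N_CHANNELS))
    return (bw + 1) - 3.0 * sum(s[c] for c in span)

GAMMA, LR, EPS = 0.9, 1e-2, 0.1
for _ in range(2000):
    s = rng.integers(0, 2, STATE_DIM).astype(float)    # sensed spectrum snapshot
    q, h = q_values(s)
    a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(np.argmax(q))
    s_next = rng.integers(0, 2, STATE_DIM).astype(float)
    q_next, _ = q_values(s_next)
    td_err = reward(s, a) + GAMMA * np.max(q_next) - q[a]
    # SGD on the squared TD error, backpropagated only through action a.
    grad_h = -td_err * W2[:, a]
    W2[:, a] += LR * td_err * h
    b2[a] += LR * td_err
    W1 -= LR * np.outer(s, grad_h * (h > 0))
    b1 -= LR * grad_h * (h > 0)

q_clear, _ = q_values(np.zeros(STATE_DIM))             # interference-free spectrum
print("greedy (channel, bandwidth):", divmod(int(np.argmax(q_clear)), N_BANDWIDTHS))
```

A practical DQN would add experience replay and a target network (and, for the DDRQN variant the abstract mentions, Double Q-learning targets and a recurrent layer over successive spectrum observations); they are omitted here to keep the sketch readable.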
In modern information warfare, medium and long-range weapons with high strike precision are often used to deliver strikes that are stable, accurate, and decisive. However, the target system often comprises multiple target units, and deciding which units to attack becomes the primary issue. This study constructs a target combat system network based on complex network theory and the maximum entropy principle, establishes a key target selection model on this network, and then determines how medium and long-range weapons should strike the key target units in the enemy's combat system network. The key target selection model consists of four parts. The first is a value model of the node itself, constructed by comprehensively considering the relationship between the target and the war, the importance of the target in the system, and the threat posed by the target. The second is a node network value model based on potential field theory. The third is a cascading failure model of the target system, built from triples (result of a strike action, target unit status, and failure influence relationship between target units). The fourth, with the optimal cost-effectiveness ratio as the objective function, establishes the evaluation criterion function model. Case scenarios and simulation experiments illustrate the rationality and effectiveness of the method, which can provide effective support for commanders selecting key targets.
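The paper's node-value, potential-field, and triple-based models are not reproduced here; the following toy sketch only illustrates the cascading-failure and cost-effectiveness-ratio ideas on an invented five-unit target network, where a directed edge u → v means that the loss of unit u disables unit v. All unit names, values, and strike costs are hypothetical.

```python
from collections import deque

# Invented target system: failure-influence edges, unit values, strike costs.
edges = {
    "radar":       ["sam_battery", "command"],
    "command":     ["sam_battery", "artillery"],
    "sam_battery": [],
    "artillery":   [],
    "depot":       ["artillery"],
}
node_value = {"radar": 5.0, "command": 8.0, "sam_battery": 6.0,
              "artillery": 4.0, "depot": 3.0}
strike_cost = {"radar": 2.0, "command": 6.0, "sam_battery": 3.0,
               "artillery": 2.0, "depot": 1.0}

def cascade(struck):
    """All units that fail, directly or via cascading influence, after striking `struck`."""
    failed, frontier = {struck}, deque([struck])
    while frontier:
        u = frontier.popleft()
        for v in edges[u]:
            if v not in failed:
                failed.add(v)
                frontier.append(v)
    return failed

def effectiveness_ratio(struck):
    """Total system value destroyed per unit of strike cost (the toy objective)."""
    return sum(node_value[u] for u in cascade(struck)) / strike_cost[struck]

# Key target = the unit maximizing the cost-effectiveness ratio.
best = max(edges, key=effectiveness_ratio)
print(best, round(effectiveness_ratio(best), 2))
```

In this toy network the radar is the key target: striking it cascades through the command node to disable four of the five units at the lowest strike cost, which is the kind of ranking the paper's evaluation criterion function is meant to produce.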