嵌入式计算机控制系统容错策略研究
Research on Fault-tolerant Technology of Embedded Computer Control System
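【Author】 Wang Ping (王平);
【Advisor】 Yang Genqing (杨根庆);
【Author information】 Graduate School of the Chinese Academy of Sciences (Shanghai Institute of Microsystem and Information Technology), Microelectronics and Solid-State Electronics, 2004, PhD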
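【Abstract, translated from the Chinese】 Computer control technology is widely used in spaceflight, aviation, engineering design and scientific research, process control and management, national defense, the military, and everyday life. Because computers are so widely used, their reliability has become a prominent problem: many applications require the computer to run stably, safely, and reliably over long periods. Fault-tolerant design of a computer control system refers to design methods aimed specifically at faults. It mainly covers fault detection and diagnosis, hardware reliability design, software reliability design, and degraded operation of the control system when hardware cannot be recovered or has no redundant backup. This thesis takes the on-board computer control system of a spacecraft as its subject and examines the main measures available for improving control-system reliability in sensor data acquisition, computer software and hardware, and verification of the fault-tolerant design.
Correct sensor output data is the basis for normal operation of a control system. Faulty sensor data arises in two ways: the sensor outputs incorrect data under interference, or the sensor itself fails and cannot obtain correct data. To detect sensor faults, the thesis uses numerical methods to predict the sensor output and derives a fault probability for the sensor by comparing the prediction with the actual output. Because abrupt jumps in sensor output occur frequently in the space environment, the fault probability alone cannot establish that a sensor has actually failed; an accurate judgment requires multi-source data fusion. DS evidence theory provides such a fusion-based judgment, so that whether a sensor has actually failed can be decided from the fusion result. In practice, however, the results of DS evidence theory sometimes do not match reality. After analyzing the shortcomings of DS evidence theory, the thesis proposes an improved RDS evidence theory. Its main change is to add a correlation coefficient and a reliability weighting coefficient to the DS rule for combining evidence; with the correlation between pieces of evidence removed and a reliability coefficient attached to each piece of evidence, RDS fuses multi-source data correctly and yields the correct output.
Software fault tolerance is an important part of achieving high reliability in a computer control system. To make software fault tolerant and avoid the effects of interference, techniques such as fault detection, fault recovery, damage assessment, fault isolation, and continued service can be used, together with methods such as instruction redundancy, software traps, and software watchdogs, to return a faulty system to normal operation. Because errors of many kinds may be introduced by people during programming, satellite on-board software must be reconfigurable. For this, the thesis proposes a component-based method for reconfiguring embedded software while it is running: building on componentization and modularization, it uses recovery blocks, multiple versions, and software injection to reconfigure the software. Because the main problem that on-board computer control systems currently face in the space environment is the effect of radiation-induced single-event effects, the thesis proposes a design method in which software tolerates hardware errors. The method is based on data duplication: at every stage in which the program produces new data, the source data and output data are duplicated and compared, the equivalent computation is performed several times, and the final result is obtained by comparing the results for consistency. To solve the problem of comparing results for consistency, the thesis also proposes a data consistency comparison method based on dynamic fuzzy clustering, which yields the correct output.
Whether a fault-tolerant design is correct has to be established through fault-tolerance verification. In verifying the fault-tolerant design of a satellite power control system, the thesis uses software and hardware fault injection: a fault tree is built for the power system, its fault modes are analyzed, and faults are generated with a hardware simulator and a software emulator to verify that the hardware and software fault-tolerant design of the power control system is correct. The thesis closes with an overview of the hardware and software design, testing, and in-orbit operation of the computer system of the ChuangXin-1 (创新一号) small satellite, in which the author took part.
The main contributions are as follows. First, the thesis proposes using numerical methods to predict sensor measurements and, on that basis, obtains the sensor fault probability. Second, it makes an important improvement to Dempster-Shafer theory, a key theory of information fusion, proposing a reliable DS theory (RDS) and demonstrating with an example that RDS gives correct results in practice. Third, it proposes a process and method for software fault tolerance and embedded software reconfiguration. Fourth, it uses data duplication to give software the ability to tolerate hardware errors; combined with software EDAC, the method allows commercial components to be used reliably in on-board computers. Fifth, it proposes a data consistency checking method based on dynamic fuzzy clustering for judging the correctness of multiple outputs.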
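The duplicate-and-compare idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `redundant_compute`, `checksum`, and `with_upset` are hypothetical, the computation is a toy checksum, and a plain exact-match majority vote stands in for the dynamic fuzzy clustering comparison, whose details the abstract does not give.

```python
import copy
from collections import Counter


def redundant_compute(func, data, runs=3):
    """Run func on independent copies of data and majority-vote the results.

    Results must be hashable so that they can be counted.
    """
    results = []
    for _ in range(runs):
        private_copy = copy.deepcopy(data)  # each run works on its own input copy
        results.append(func(private_copy))
    value, votes = Counter(results).most_common(1)[0]
    if votes <= runs // 2:
        raise RuntimeError("no majority among redundant results")
    return value


def checksum(words):
    """Toy computation (assumption): 16-bit additive checksum."""
    return sum(words) & 0xFFFF


def with_upset(func, bad_call):
    """Hypothetical fault injector: flip one bit in the result of one call."""
    state = {"calls": 0}

    def wrapped(data):
        state["calls"] += 1
        result = func(data)
        return result ^ 0x10 if state["calls"] == bad_call else result

    return wrapped


# The second of three runs is corrupted; the vote masks the transient error.
voted = redundant_compute(with_upset(checksum, bad_call=2), [1, 2, 3])
print(voted)
```

An exact-match vote only suits results that should be bit-identical; for noisy numeric outputs some tolerance-based grouping would be needed, which is the role the abstract assigns to the clustering method.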
【Abstract】 Computer control technology is applied in many fields, such as space navigation, aviation, process control, engineering design, management, and the military. Because computers are so widely used, their reliability has become an important problem. Fault-tolerant design of a computer control system covers the methods that reduce the effect of faults, such as fault detection and diagnosis (FDD), dependable hardware techniques, dependable software techniques, and degraded operation when hardware cannot recover or has no redundant part. This thesis mainly studies the computer control system of a satellite, and its central issue is how to improve the reliability of the control system by software techniques.
A control system needs correct data from its sensors, so the research first addresses sensor fault detection and diagnosis. Incorrect data arises in two ways: from disturbances of various kinds, and from faults in the sensor itself. To detect sensor faults, the thesis predicts the next sensor output with a numerical algorithm and obtains the sensor's fault probability by comparing the predicted data with the data the sensor actually acquires. Because sharp jumps in sensor output are observed frequently, this probability alone cannot establish that the sensor is actually faulty; multi-source data fusion is needed to make that judgment. DS evidence theory is an appropriate method for this, but traditional DS evidence theory does not agree with the actual situation in some cases. After analyzing the defects of DS evidence theory, the thesis puts forward RDS, a reliable evidence theory based on DS theory. RDS introduces a reliability weight coefficient and a correlation coefficient, which allow it to produce the correct output.
Fault-tolerant software is important for achieving high reliability in a computer control system. Methods such as instruction redundancy, software traps, and software watchdog timers can be applied to avoid disturbances and implement fault tolerance, so that the system can resume normal operation. Because designers may introduce errors of many kinds, the software of a satellite computer should be reconfigurable. The thesis provides a component-based reconfiguration technique for embedded software: building on modular design, the software can be reconfigured by means such as recovery blocks, N-version programming, and software injection. The main problem that a satellite computer system encounters is single-event upsets (SEUs) caused by radiation, so the thesis provides a design method that tolerates transient hardware errors. The method is based on data copying: at every point where the program produces new data, it duplicates and compares the source data and the output data, and the final output is the consensus of the data produced by the redundant computations. To find the consensus data, the thesis puts forward a method based on a dynamic fuzzy clustering algorithm.
The validity of a fault-tolerant design needs to be verified. The thesis verifies the fault tolerance of a power control system by injecting software and hardware faults. The method builds a fault tree, analyzes the fault modes, and then produces faults with a simulator and an emulator to verify the fault tolerance. Finally, the thesis introduces the computer hardware and software design of the ChuangXin-1 micro-satellite, of which the author is one of the designers, and its in-orbit testing.
The main contributions of the thesis are as follows. First, the thesis predicts the sensor output with a numerical algorithm and from this derives the fault probability of the sensor. Second, it improves Dempster-Shafer evidence theory, puts forward a reliable DS theory (RDS), and demonstrates the correctness of RDS with a case. Third, it provides a process for the reconfiguration of embedded software. Fourth, it implements tolerance of hardware errors in software by data copying; integrated with software EDAC, the method can achieve a reliable satellite computer built from COTS components. Fifth, it proposes a data consistency checking method based on dynamic fuzzy clustering for judging the correctness of multiple outputs.
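For readers unfamiliar with the evidence combination that the abstract builds on, the following Python sketch shows the classical Dempster rule for two sources. It does not implement RDS: the abstract does not give the form of the reliability weight or correlation coefficients, so they are omitted here. The function name `dempster_combine`, the two-hypothesis frame, and the mass values are assumptions chosen only for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses within one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass assigned to contradictory pairs
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {focal: mass / norm for focal, mass in combined.items()}


# Illustrative frame of discernment: the sensor is either OK or faulty.
OK = frozenset({"ok"})
FAULT = frozenset({"fault"})
EITHER = OK | FAULT  # mass left on the whole frame expresses ignorance

# Made-up masses: one source from prediction residuals, one from another sensor.
m_prediction = {FAULT: 0.6, OK: 0.1, EITHER: 0.3}
m_other_sensor = {FAULT: 0.7, OK: 0.2, EITHER: 0.1}

fused = dempster_combine(m_prediction, m_other_sensor)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 4))
```

The rule treats both sources as independent and equally reliable, which are the two assumptions the abstract says RDS relaxes through its correlation and reliability coefficients.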
【Key words】 embedded computer; fault-tolerant; software; control system; reliability; DS evidence theory;
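- 【Online publication contributor】 Graduate School of the Chinese Academy of Sciences (Shanghai Institute of Microsystem and Information Technology) 【Online publication issue】 2005, No. 01
- 【Classification number】 TP273.5
- 【Times cited】 29
- 【Downloads】 2303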