Self-test

Things go wrong. Electronic components die. Systems fail. This is almost inevitable and, the more complex that systems become, the more likely it is that failure will occur. In complex systems, however, that failure might be subtle; simple systems tend to just work or not work.

As an embedded system is “smart”, it seems only reasonable that this intelligence can be directed at identifying and mitigating the effects of failure …

Self-testing is the broad term for what embedded systems do to look for failure situations. I suspect that many conference papers have been presented on this topic [or could be – now, there is a thought …]. It would even make the subject of a book or, at least, an extended series of articles. But I just want to identify some of the key issues.

Broadly, an embedded system can be broken down into 4 components, each of which can fail:

  • CPU
  • Software
  • Peripherals
  • Memory

CPU failure is not too common, but is far from unknown. Unfortunately, there is very little that a CPU can do to predict its own demise. Of course, in a multicore system, there is the possibility of the CPUs monitoring one another.

Software failure is obviously a possibility and defensive code may be written to avoid some possible failure modes. This is quite a large subject that I will return to another day, but I did discuss some aspects on a post last year. Of course, a bug in the software might lead to a totally unpredictable failure.

Peripherals can fail in many and varied ways. Each device has its own possible failure modes. To start with, the self-test software can check that each peripheral is responding to its assigned address and has not failed totally. Thereafter, any further self-test is very device dependent. For example, a communications port may have a “loop back” mode, which enables the self test to verify transmission and reception of data.

Memory is a critical of an embedded system of course and is certainly subject to failure from time to time. Considering how much memory is installed in modern systems, it is surprising that catastrophic failure is not more common. Like all electronic components, the most likely time for memory chips to fail is on power up, so it is wise to perform a comprehensive test then, before vital data is committed to faulty memory.

If a memory chip is responding to being addressed, there are broadly two possible failure modes: stuck bits [i.e. bits that are set to 0 or 1 and will not change]; cross-talk [i.e. setting/clearing one bit has an effect on one or more other bits]. If either of these failures occurs while software is running, it is very hard to trace. The simplest test to look for these failures on start-up is a “moving ones” [and “moving zeros”] test. The logic for moving ones is simple:

set every bit of memory to 0
for each bit of memory
{
   verify that all bits are 0
   set the bit under test to 1
   verify that it is 1 and that all other bits are 0
   set the bit under test to 0
}

A moving zeros test is the same, except that 0 and 1 are swapped in this code.

Coding this test such that it does not use any RAM to execute [assuming start up code is running out of flash] is an interesting challenge, but most CPUs have enough registers to do the job.

Of course, such comprehensive testing cannot be performed on a running system. A background task of some type can carry out more rudimentary testing using this kind of logic:

for each byte of memory
{
   turn off interrupts
   save memory byte contents
   for values 0x00, 0xff, 0xaa, 0x55
   {
      write value to byte under test
      verify value of byte
   }
   restore byte data
   turn on interrupts
}

These testing algorithms, as described, assume that all you know about the memory architecture is that it spans a series of [normally contiguous] addresses. However, if you have more detailed knowledge – which memory areas share chips or how rows and columns are organized – more optimized tests may be devised. This is desirable, as a slow start up will impact user satisfaction with a device.

A final question is what to do if a failure is detected. Of course, this will be different for every system. Broadly, the system should be put in a safe state [shut down?] and the user advised.

Post Author

Posted August 27th, 2013, by

Post Tags

, ,

Post Comments

1 Comment

About The Colin Walls Blog

This blog is a discussion of embedded software matters - news, comment, technical issues and ideas, along with other passing thoughts about anything that happens to be on my mind. The Colin Walls Blog

Comments

One comment on this post | ↓ Add Your Own

Commented on 30 August 2013 at 21:33
By Links da semana #4 | Blog do Sergio Prado

[…] Colin Walls fala sobre como desen­volver sis­temas embar­ca­dos capazes de iden­ti­ficar fal­has através de testes automáti­cos.Self-test […]

Add Your Comment

You must be logged in to post a comment.

Archives

October 2014
  • I am a star of TV and video
  • The Internet (or not) of Things
  • The non-ARPS
  • Embedded software articles – from simulation to optimization
  • Who needs a taxi?
  • Real time or not? What operating system makes sense?
  • Photography and getting my ARPS (or not)
  • Another ARM conference
  • “Six of the Best” – Famous dead people
  • September 2014
  • ARM TechCon – See you there?
  • “Six of the Best” – Cities
  • Multiple constructors, the :: operator and memory speed – more questions answered
  • “Six of the Best” – iPad apps
  • A hypervisor on a multicore system
  • Do I spit it out? A wine tasting 101
  • Embedded software articles – from incrementing in C/C++ to using dynamic memory
  • Pulling the cork
  • Default parameters, C++ determinism and other C++ questions
  • August 2014
  • How about a “Penny Lick”? Ice-cream the Victorian way
  • Multicore Linux, DO-178B and RTOS performance
  • Two steps forward, one step back – learning from setbacks
  • Embedded software articles – from assembly language to software IP
  • The 5:2 way of eating
  • Undefined behavior and other delights of (bad) C programming
  • July 2014
  • Deafness, sign language and babies
  • Quick and dirty code checking and execution
  • Words on tap – are streaming books next?
  • Multicore systems: heterogenous architectures – untangling the technology and terminology
  • Words – new, old and odd
  • Another programming language survey – what are you using?
  • What to do with all those pictures
  • Uniqueness of C++ methods and class member variables
  • The Big Mac question
  • June 2014
  • C++ – yet more Questions and Answers
  • IQ – myth or meaningful?
  • C++ – more Questions and Answers
  • Asia [the band, not the continent]
  • Embedded software resources: a reading list for engineers
  • Doing a good turn
  • struct vs class in C++
  • May 2014
  • RIP Gordon Cameron
  • Problems with pointers: out of scope, out of mind
  • 1st world and 3rd world technology
  • C++ Exception handling continued – words from an insider
  • Internet shopping and customer service
  • C++ Q&A
  • Exception handling in C++ – developers are wary
  • 1000 words
  • Dynamic memory in real time systems – a solution?
  • Genetic modification – where will it all end?
  • April 2014
  • C++ for embedded – your input needed
  • The Power Distance Index and safe flying
  • Wi-Fi, ZigBee, Miracast and Bluetooth: wireless networking on the move – without a router
  • Are you good enough?
  • USB 3.1 – more speed, more power and new connectors
  • California drought: how water is wasted in huge volumes
  • How was EE Live! for you?
  • Teaching kids to program – why would we want to do that?
  • March 2014
  • See you at EE live!
  • Alien technology?
  • A date at DATE [the software perspective on virtual prototyping]
  • Is it normal to be normal?
  • Selecting a CPU
  • Set in stone
  • Embedded software engineering priorities – a response
  • Eggheads
  • Embedded World – “This is not EDA”
  • February 2014
  • Astrology
  • Power Management, Multicore: Embedded World 2014
  • An RTOS with a GUI
  • Me and my iPad
  • Google hangout on IoT
  • Embedded software engineering priorities
  • I am an ambassador
  • Embedded systems in cars – challenges and solutions
  • January 2014
  • A time for celebration?
  • Low power multicore from a software designer’s perspective
  • Learning to swim
  • Low power modes
  • The 4th dimension
  • Safety critical sensors in cars
  • Goose bumps
  • Selecting an embedded operating system
  • December 2013
  • Christmas traditions
  • Using an unsigned integer to store a pointer in C
  • How selfish are you? Is there such a things as altruism?
  • GENIVI Diagnostic Log and Trace
  • Gravity – a memorable movie?
  • Embedded virtualization: Out-of-the-Box and into-the-fire?
  • November 2013
  • The harpist
  • Hypervisor applications
  • Back to Berlin
  • Conferences in perspective
  • Mars again
  • Embedded hypervisors
  • Stockholm taxi rip-off – a warning
  • ECS & IP-SoC
  • ARM Tech Con in Sixteen Hours: Top Five Moments
  • October 2013
  • The 100 Year Old Startup
  • ARM TechCon
  • How does this shape taste?
  • Busy, busy, busy
  • Donating blood
  • The demise of the reset button
  • (It’s not) All about me
  • The next Big Thing
  • The Week Plan
  • September 2013
  • Embedded file systems
  • There was no big bang
  • Smart Energy panel
  • How old is your car?
  • On vacation
  • Ah! Lua
  • Knots and splices
  • Software integrity testing
  • August 2013
  • Forgetting stuff
  • Self-test
  • How to live longer
  • big.LITTLE
  • Is it Summer yet?
  • Resistive RAM
  • No calls please!
  • Measuring scheduling latency
  • Careers and ambition
  • July 2013
  • An interview
  • Lego and the Pyramids
  • Programming languages and prize winners
  • Shopping woes
  • Munich conferences
  • The Nun’s Prayer
  • Three degrees of freedom
  • Water
  • How long is a piece of string?
  • June 2013
  • QR codes
  • Which embedded programming language?
  • Which way is up?
  • SEP 2.0
  • Genetic art
  • Device Firmware Upgrade through USB
  • My genes?
  • Non-intrusive debug
  • May 2013
  • Two heads are better than one
  • Using an SMTP client
  • First non-contact
  • Book review [part 1]
  • Everyday rhetoric
  • Embedded education
  • Gone flying
  • Hardware and software development in synch
  • Why is traveling so hard?
  • April 2013
  • After Design West
  • Great expectations
  • Design WEST
  • Get lost!
  • What is an FPGA?
  • How big is a sheet of paper?
  • Debugging with printf() or not …
  • New thoughts on Evernote
  • Power management webinar
  • March 2013
  • Time annoyance
  • Reading the meter
  • Integer money
  • Endianness
  • An intimate evening with Al
  • Counting on your fingers
  • Getting sorted
  • Judging distance
  • Innovate!
  • Wildebeest
  • February 2013
  • Embedded World 2013
  • A 1000 days with an iPad
  • Why C is faster than assembly
  • Chocolate is good
  • Reentrant write-only ports
  • Losing the penny
  • More on low power CPU design
  • January 2013
  • The Hype Cycle
  • Write-only ports in C++
  • An alternative to online dating
  • Designing a low power CPU
  • The mayonnaise jar and the two pints of beer
  • Write-only ports
  • The new Box Brownie
  • OS configuration
  • New Year in Berlin
  • December 2012
  • Holiday time!
  • This cannot continue
  • Not so much of a puzzle
  • Another article
  • Why bother with a DSLR?
  • In the news
  • Back to Mars
  • Less puzzled
  • November 2012
  • Connecting online
  • USB 3.0
  • Rant: insurance and sex discrimination
  • Power and compilers at the NMI
  • Are you shy?
  • Webinars
  • Big Issue!
  • Blinking is good
  • Just Julian
  • October 2012
  • ARM TechCon 2012
  • Repair or replace?
  • A puzzle
  • On the weekend
  • VSIPL++ Standard goes Global
  • A surprise meeting
  • HPEC ‘12
  • Choice = stress?
  • ECS2012
  • September 2012
  • Viva Las Vegas
  • Embedded software tools – then and now
  • Dimensionally intelligent
  • ESC Boston
  • Doh!
  • The floating point argument
  • iPad apps
  • The Power Pyramid
  • August 2012
  • The smallest room
  • Floating point
  • Left to right
  • Evaluation boards
  • 5 + 5 + 5 – 5 + 5 + 5 – 5 + 5 x 0 = ?
  • Curiously embedded
  • Infected by Olympic fever
  • RTOS memory footprint
  • Are you happy?
  • July 2012
  • Why should I care about software standards?
  • I give up
  • Going to sleep
  • Having fun
  • Munich Embedded Developers’ Forum
  • Evernote hints & tips
  • A more powerful phone
  • A very special day
  • HPC
  • June 2012
  • Reinventing the wheel
  • More on low power
  • Power Suckers
  • Too many pixels
  • In a state
  • Multicore thread synchronization
  • Let’s all hold hands
  • Freescale Technology Forum
  • Don’t jump!
  • Measuring interrupt latency
  • May 2012
  • PowerPoint hints and tips #3
  • RTOS in demand
  • Snaps and splicing
  • GTC – mission accomplished
  • Coffee in Italy
  • Measuring RTOS performance – a Web seminar
  • The Italian cuckoo
  • Preparing for GTC
  • Photography rebooted
  • April 2012
  • Who needs a Web server?
  • PowerPoint hints and tips #2
  • New book
  • Video
  • How do you tell the time?
  • In an open-source world, it’s all about integration
  • What are meetings for?
  • Comparing apples with apples
  • Poison yogurt
  • Embedded software performance optimization: Forget about it!
  • March 2012
  • Universal time
  • ESC imminent
  • PowerPoint hints and tips #1
  • Not the Embedded Systems Conference
  • Signs of the times
  • Trends
  • Hail the straw man
  • Embedded World visited
  • Seeing the light
  • February 2012
  • Looking forward to Embedded World
  • Death by PowerPoint
  • A slick UI
  • The first 12 notes of a song
  • IPv6 is really coming
  • Am I fit?
  • More on System-C
  • A random act of kindness
  • January 2012
  • Q&A
  • Who do I think I am?
  • Interview
  • Joining the embedded bandwagon
  • Not going metric
  • Goodbye ICT
  • Collectomania
  • Everything is wireless
  • Crystal ball gazing
  • December 2011
  • Season’s greetings
  • The invisible RTOS continued
  • Stop, thief!
  • The invisible RTOS
  • Good health to you
  • On the road again
  • Culture
  • November 2011
  • System-C for embedded systems programming
  • Who am I?
  • What is the plural of Linux?
  • Data in the cloud
  • Device drivers on SMP systems
  • The death of the CD?
  • Who needs a debugger anyway?
  • How many?
  • October 2011
  • Hardware designers and software
  • Becoming a millionaire
  • Dennis Ritchie and C for embedded
  • How green is my aircraft?
  • The value of software
  • Winter fuel
  • Efficient code – quiz answers
  • The alarm
  • Conference season
  • September 2011
  • All the data in the world
  • Efficient code – a quiz
  • Sonos
  • C libraries
  • A key to success
  • VDC survey
  • And where were you?
  • Embedded expectations
  • Being rich
  • August 2011
  • Stuffing bits
  • Pecha Kucha
  • Web seminar
  • Instrumentation
  • Working with fire and steel
  • Monolithic or not
  • How did we do it?
  • Get packing
  • Monopoly
  • The ideal programming language?
  • July 2011
  • Festival fever
  • Embedded tools – the third way
  • Drawing on the right side of the brain
  • What language?
  • Swifts
  • Data types and code portability
  • Traveling light
  • ARM Development Conference
  • June 2011
  • Giving it away
  • USB 3.0 and data storage
  • Tempus fugit
  • On the move
  • OGWT
  • After DAC
  • How would you like it done?
  • System power
  • Tilting at windmills
  • More on DAC
  • May 2011
  • Memory signature follow-up
  • Handedness follow-up
  • DAC
  • USB power
  • E-reading
  • Web seminar
  • Memory signature
  • Non-human handedness
  • And the winner is …
  • What time is it?
  • ESC Silicon Valley
  • April 2011
  • Do e-books make me read more?
  • USB 3.0
  • Too many mega-XXX
  • Choosing an embedded operating system
  • Fireworks
  • Web seminar
  • USB – class drivers
  • Legal anachronisms
  • Why does power matter?
  • Web seminar
  • March 2011
  • Precise or random?
  • Device registers in C
  • Moderate cycling on the road to Damascus
  • Share and share alike
  • Evernote
  • Why move to C++?
  • Perspectives on a missed opportunity
  • Embedded World 2011
  • Web seminar
  • The parcel that wasn’t stolen
  • February 2011
  • Hiring embedded software engineers
  • Cool Katie
  • A question of honor
  • C++ reference parameters – the downside
  • The happy ending
  • A different angle on multicore
  • Jubilee!
  • Who needs hardware designers?
  • Big vs little
  • January 2011
  • Where does embedded software come from?
  • The King’s Speech
  • C++ – for loops
  • iPad/iPhone iOS 4 closing apps
  • Secret codes
  • A movement
  • Software saves lives
  • Best tech of 2010
  • December 2010
  • Have a drink on me
  • The power of a greeting
  • Heavy elements
  • Embedded Software Engineering Kongress 2010
  • Why grow?
  • More uses for an MMU
  • A one way trip
  • November 2010
  • Embedded software in 2011
  • Friends, Romans and Dunbar’s number
  • Firmly in line
  • In judgment
  • ARM Tech Con 2010
  • Bad food
  • The UI debate
  • And you are?
  • USB – even more
  • October 2010
  • What’s my iPad for?
  • Scandinavian embedded
  • Back to school
  • Go Forth!
  • Up hill
  • The interview
  • Starting a business
  • A busy season
  • September 2010
  • Moving house
  • RTOS deployment
  • Element 13
  • What date is it?
  • USB – real speed
  • The wrong word
  • The one line RTOS
  • Handedness
  • August 2010
  • C++ at fault
  • RPN
  • Embedded Linux – why?
  • Thanks for the memory
  • USB – the need for speed
  • What is stress?
  • Function parameters
  • The wardrobe
  • Chinese embedded
  • July 2010
  • iPad – how I feel
  • Why write code?
  • More laziness
  • Electronics for the sick
  • Can you do the placebo?
  • AMP & SMP revisited
  • iPad – an interim report
  • When compilers do magic
  • Arthur Smith and the Italians
  • The Soviet Empire – my part in its downfall
  • June 2010
  • EDA and embedded software
  • No comment on the World Cup
  • Laziness as a business asset
  • MCAPI – lessons learned
  • Being a mentor
  • How did I get into this?
  • Uniquely annoying
  • AMP vs SMP
  • Reality
  • Static or static
  • May 2010
  • A good day out
  • Hacking your car
  • What sign are you?
  • Using an MMU
  • Books and e-books
  • OS influence on power consumption
  • iPad impressions
  • ESC Silicon Valley 2010
  • April 2010
  • What is my PC for?
  • Introducing MCAPI
  • The perfect day
  • Vintage multi-core – the IPC
  • The 4 Ds principle
  • Vintage multi-core – introduction
  • Out of gas?
  • RTS Embedded Systems 2010
  • Learning a language
  • March 2010
  • USB – what we learned
  • Perfect chocolate
  • Graphical medicine
  • Thinking is important
  • Technology trends
  • Words on the move
  • EW 2010
  • The AIRpod
  • What is “real time”?
  • February 2010
  • A taxi in Munich
  • Staying in line
  • The Names Game
  • Android beyond mobile
  • Avatar
  • Thanks for the memory
  • What is my phone for?
  • Overloading or obfuscation?
  • January 2010
  • Indefinite
  • Product quality: belief or proof?
  • Vacation for the PC repair man
  • Agile revisited
  • Time’s arrow
  • Small or fast?
  • aMAZEd
  • C, C++ and the family tree
  • December 2009
  • A Google mobile lab, anyone?
  • Crystal ball
  • Stand-by or boot-up
  • There, their
  • Agile embedded
  • Innovation as an Asset
  • IP/ESC’09
  • Keeping the lights on
  • November 2009
  • Blocking and non-blocking APIs
  • Cars made from wood
  • Yes!
  • 8 bits anyone?
  • Driving on the right side of the road
  • C is great, but…
  • Droids dropping like bombs
  • Parlez vous Fortran?
  • Palindromic numbers
  • Multi-core, multi-OS confusion
  • October 2009
  • Muffins in Colorado
  • Where does Ford make its paint?
  • Time travel
  • Introducing the iBrush
  • Sex, money and the weather
  • Heap contiguity revisited
  • Going green
  • Simulation – better than the real thing?
  • Mind your language
  • September 2009
  • MMU and heap contiguity
  • What kind of pillows do you want?
  • Creeping elegance
  • Government sponsored charity
  • How many mobile phones?
  • Stand aside! I’m an expert
  • Seeing is believing
  • A plea
  • August 2009
  • volatile
  • Stolen!
  • Multi-core/multi-OS – terminology
  • Feedback
  • The RTOS business – a story
  • Reading aloud
  • ++i or i++?
  • The .99 phenomenon
  • USB: easy, but …
  • July 2009
  • Problem solving
  • Devices that phone home
  • An odd coincidence
  • Y2K redux
  • Mind mapping
  • The Works
  • Best barman in the world
  • When a fancy UI is not a luxury
  • Live music
  • June 2009
  • Language extensions
  • The 8 day week
  • Fix it in the software!
  • How to sell more toothpaste
  • What is an embedded system?
  • Glass blowing
  • There are 10 kinds of people in the world
  • You want a macchiato?
  • Who needs OS source code?
  • May 2009
  • By royal appointment
  • Assembly language is always smallest/fastest – not!
  • Exhibitionism
  • Would you buy a TCP/IP stack from me?
  • Linux in the embedded world
  • How much of a geek am I?
  • Why medical systems need WiFi
  • Boldly splitting infinitives
  • ESC Silicon Valley 2009: The Fashion Parade