
Age Estimation (AGE)

Phonexia Age Estimation (AGE) estimates the age of a speaker from an audio recording or a voiceprint.

Technology

  • Trained with emphasis on spontaneous telephony conversation
  • The technology is language-, accent-, text-, and channel-independent
  • Compatible with the widest possible range of audio sources through channel compensation techniques: GSM/CDMA, 3G, VoIP, landlines, etc.

Input

  • Audio: WAV or RAW (8 or 16 bits linear coding), A-law or Mu-law, PCM, 8 kHz+ sampling
  • Voiceprints: AGE L4 model supports SID4 L4 voiceprints; legacy AGE models support voiceprints created by AGE itself
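
The audio requirements above can be checked before a file is submitted. The following is a minimal sketch using only Python's standard `wave` module, which is intended for linear PCM WAV files, so A-law, Mu-law, and RAW inputs are not covered. The function names `check_wav` and `make_test_wav` are hypothetical helpers and are not part of any Phonexia tool.

```python
import io
import wave

MIN_SAMPLE_RATE_HZ = 8000       # "8 kHz+ sampling" from the input list
ALLOWED_SAMPLE_WIDTHS = (1, 2)  # 8 or 16 bits, expressed in bytes per sample


def check_wav(source):
    """Return a list of problems found in a linear PCM WAV file (empty if none).

    `source` is a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        if rate < MIN_SAMPLE_RATE_HZ:
            problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
        if width not in ALLOWED_SAMPLE_WIDTHS:
            problems.append(f"sample width {8 * width} bits is not 8 or 16")
    return problems


def make_test_wav(rate_hz, width_bytes):
    """Build a one-second, zero-filled mono WAV in memory (illustration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width_bytes)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * rate_hz * width_bytes)
    buffer.seek(0)
    return buffer


print(check_wav(make_test_wav(8000, 2)))  # no problems expected
print(check_wav(make_test_wav(4000, 2)))  # sample rate too low
```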

Output

  • Log file with processed information (age estimate)

Processing speed

Approx. 20x faster than real time on 1 CPU core, i.e. a standard 8-core server processes 3,840 hours of audio in 1 day of computing time.
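
The throughput figure follows from simple arithmetic, sketched below. The sketch assumes that processing scales linearly with the number of CPU cores, which is what the 8-core figure above implies; the function name is hypothetical.

```python
def audio_hours_per_day(speedup=20, cores=8, hours_per_day=24):
    """Hours of audio processed in one day of computing time.

    Assumes each core runs `speedup` times faster than real time
    and that throughput scales linearly with the number of cores.
    """
    return speedup * cores * hours_per_day


print(audio_hours_per_day())         # 3840
print(audio_hours_per_day(cores=1))  # 480
```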

Representation of the results: 

For the CMD version 

Name_of_the_file.wav Age[integer – limited to 99]
example/david_1.wav 41
example/david_2.wav 40
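
A line of CMD output in this format can be split into a file name and an age as sketched below. This is a minimal illustration that assumes one whitespace-separated record per line, with the age as the last field; `parse_cmd_line` is a hypothetical helper, not part of the product.

```python
def parse_cmd_line(line):
    """Split one output line into (file_name, age).

    The age is taken from the last whitespace-separated field,
    so file names that contain spaces are still handled.
    """
    file_name, age = line.rsplit(None, 1)
    return file_name, int(age)


for line in ["example/david_1.wav 41", "example/david_2.wav 40"]:
    print(parse_cmd_line(line))
```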

For the SPE version 

name – representing the age
score – representing the score for the age [1/0]

To produce a result, each age receives a score; the age whose score equals "1" is the age estimated by the system.

{
    "result": {
        "version": 2,
        "name": "AgeEstimationResult",
        "file": "/kelly_2.wav",
        "model": "L",
        "channel_scores": [
            {
                "channel": 0,
                "scores": [
                    {
                        "name": "0",
                        "score": 0
                    },
                    {
                        "name": "1",
                        "score": 0
                    },
                    . . .
                    {
                        "name": "41",
                        "score": 1
                    },
                    {
                        "name": "42",
                        "score": 0
                    },
                    . . .

To obtain the most representative results, a span of +/- 10 years should be applied to each estimate, e.g. an estimate of 41 is read as 31 to 51.
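
The SPE result can be reduced to one estimated age per channel, with the +/- 10 year span applied, as sketched below. The sample payload is an abbreviated illustration that follows the structure shown above, keeping only three score entries; it is not real system output. The function names are hypothetical, and clamping the lower bound at 0 is an assumption made for this sketch.

```python
import json

# Abbreviated illustration of the SPE result structure (not real output).
SAMPLE = """
{
  "result": {
    "version": 2,
    "name": "AgeEstimationResult",
    "file": "/kelly_2.wav",
    "model": "L",
    "channel_scores": [
      {
        "channel": 0,
        "scores": [
          {"name": "40", "score": 0},
          {"name": "41", "score": 1},
          {"name": "42", "score": 0}
        ]
      }
    ]
  }
}
"""


def estimated_ages(payload):
    """Map each channel number to the age whose score equals 1."""
    ages = {}
    for channel in payload["result"]["channel_scores"]:
        for entry in channel["scores"]:
            if entry["score"] == 1:
                ages[channel["channel"]] = int(entry["name"])
                break
    return ages


def age_range(age, span=10):
    """Apply the +/- span to an estimate; the lower bound is clamped at 0."""
    return max(0, age - span), age + span


ages = estimated_ages(json.loads(SAMPLE))
print(ages)                # {0: 41}
print(age_range(ages[0]))  # (31, 51)
```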
