Cluster analysis - Using variable-length data inputs with EM algorithm clustering


We have a set of sequences of taxi positions. We want to cluster the data while taking into account the sequential patterns in each line. For example, taxis t1, t2, t3, t4 travel between a set of places a, b, c, d, e. The data looks like this:

  • t1 b c b d
  • t2 a
  • t3 b b b c e d
  • t4 b c d c b d c a

But the problem is that the length of the data is variable. How can we cluster this type of data using EM? Since it does not accept variable-length data, is there a way to customize it? Thanks.

EM is a general principle. You can use it with different models.

Probably the most popular model for EM is Gaussian mixture modeling (GMM).

Naturally, if you use covariances, GMM requires a fixed dimensionality.

But if you use other models, there is no reason it cannot work with variable-length vectors. For example, there are EM variants that process text data, and texts have different lengths.
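As a rough illustration of "other models", here is a minimal sketch that runs EM on a mixture of first-order Markov chains, which is one possible choice for sequences of places and not something the answer itself prescribes. The function name `fit_markov_mixture`, the smoothing constant `alpha`, the number of clusters `k` and the random initialization are all assumptions made for this example; it also assumes every sequence is non-empty.

```python
import math
import random


def _normalize(values):
    total = sum(values)
    return [v / total for v in values]


def fit_markov_mixture(seqs, states, k=2, n_iter=50, alpha=0.1, seed=0):
    """EM for a mixture of k first-order Markov chains (illustrative sketch).

    Returns one row of cluster responsibilities per sequence.
    """
    rng = random.Random(seed)
    index = {s: i for i, s in enumerate(states)}
    data = [[index[s] for s in seq] for seq in seqs]
    m = len(states)

    # Start from random soft assignments of sequences to clusters.
    resp = [_normalize([rng.random() + 0.1 for _ in range(k)]) for _ in data]

    for _ in range(n_iter):
        # M-step: re-estimate each chain from responsibility-weighted counts.
        # alpha is additive smoothing so that no probability is exactly zero.
        weight = [alpha] * k
        start = [[alpha] * m for _ in range(k)]
        trans = [[[alpha] * m for _ in range(m)] for _ in range(k)]
        for r, seq in zip(resp, data):
            for c in range(k):
                weight[c] += r[c]
                start[c][seq[0]] += r[c]
                for a, b in zip(seq, seq[1:]):
                    trans[c][a][b] += r[c]
        weight = _normalize(weight)
        start = [_normalize(row) for row in start]
        trans = [[_normalize(row) for row in chain] for chain in trans]

        # E-step: posterior of each cluster given the whole sequence.
        # The log-likelihood sums over however many transitions a sequence
        # has, so the sequence length never needs to be fixed.
        resp = []
        for seq in data:
            logp = []
            for c in range(k):
                lp = math.log(weight[c]) + math.log(start[c][seq[0]])
                lp += sum(math.log(trans[c][a][b]) for a, b in zip(seq, seq[1:]))
                logp.append(lp)
            top = max(logp)  # subtract the max for numerical stability
            resp.append(_normalize([math.exp(lp - top) for lp in logp]))
    return resp


# The four taxi sequences from the question.
seqs = ["b c b d".split(), ["a"], "b b b c e d".split(), "b c d c b d c a".split()]
states = ["a", "b", "c", "d", "e"]
resp = fit_markov_mixture(seqs, states, k=2)
labels = [max(range(len(r)), key=r.__getitem__) for r in resp]
print(labels)
```

With only four short sequences the resulting clusters are not meaningful, and EM can end up in different local optima depending on the seed; the point is only that nothing in the E-step or M-step needs the sequences to have the same length.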

