I recently faced this issue. Even though I had 3 TB of HDD attached to the machines, I kept seeing Ambari Agent Disk Usage red alerts. It was a production cluster that was supposed to be handed over to the customer, and on that very day this error started appearing. I tried Googling and searching the Ambari forums, but it was merely a waste of time. It was a scary situation and I was under a lot of pressure. :)
Sharing the solution here so that others can benefit and fix it easily without the same pressure 😉 The alerts looked like this:
Capacity Used: [59.05%, 18.0 GB], Capacity Total: [30.5 GB], path=/usr/hdp
Capacity Used: [56.50%, 17.3 GB], Capacity Total: [30.5 GB], path=/usr/hdp
Capacity Used: [60.61%, 18.5 GB], Capacity Total: [30.5 GB], path=/usr/hdp
Along with this, you may also see an error like the following from the NodeManager:
1/1 local-dirs are bad: /hadoop/yarn/local; 1/1 log-dirs are bad: /hadoop/yarn/log
Generally this happens because YARN applications generate a lot of temporary intermediate data while executing jobs, and those directories sit on the same small disk that Ambari is monitoring (only the /usr/hdp path by default, as shown above). If usage keeps climbing past the NodeManager disk health checker's utilization threshold (90% by default), the local and log dirs get marked as bad.
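Before moving anything, it's worth confirming which filesystem the YARN directories actually live on. A minimal check, assuming the default /hadoop/yarn/* paths (adjust to your layout):

# Show which filesystem /usr/hdp and the YARN dirs live on, and their usage
df -h /usr/hdp /hadoop/yarn/local /hadoop/yarn/log

# See how much space YARN's intermediate data and logs are actually consuming
du -sh /hadoop/yarn/local /hadoop/yarn/log

If df shows all three paths on the same nearly full mount, the fix below applies.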
Fixing it is relatively easy :
- Create the log directory and the local directory (for intermediate data) on each data mount, and set the owner of these directories to yarn (see the loop sketch after the commands below).
mkdir -p /mnt/datadrive01/hadoop/yarn/local
mkdir -p /mnt/datadrive01/hadoop/yarn/log
chown yarn:hadoop /mnt/datadrive01/hadoop/yarn/local
chown yarn:hadoop /mnt/datadrive01/hadoop/yarn/log
(Repeat for each data mount: /mnt/datadrive02, /mnt/datadrive03, /mnt/datadrive04, ...)
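If you have several mounts, a small bash loop saves typing. A sketch, assuming the mounts are named /mnt/datadrive01 through /mnt/datadrive04 as in this cluster:

for mount in /mnt/datadrive0{1..4}; do
    # Create the local (intermediate data) and log directories on each data drive
    mkdir -p "$mount"/hadoop/yarn/{local,log}
    # Hand ownership to the yarn user so the NodeManager can write to them
    chown yarn:hadoop "$mount"/hadoop/yarn/{local,log}
done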
Change the following two properties under the NodeManager section of the YARN configs in Ambari (the second one, yarn.nodemanager.log-dirs, points at the log directories created above):
yarn.nodemanager.local-dirs = /mnt/datadrive01/hadoop/yarn/local,/mnt/datadrive02/hadoop/yarn/local,/mnt/datadrive03/hadoop/yarn/local,/mnt/datadrive04/hadoop/yarn/local
yarn.nodemanager.log-dirs = /mnt/datadrive01/hadoop/yarn/log,/mnt/datadrive02/hadoop/yarn/log,/mnt/datadrive03/hadoop/yarn/log,/mnt/datadrive04/hadoop/yarn/log
Restart the affected components and you are done.
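To verify the fix took effect, you can check that the NodeManagers are healthy again and that new container data lands on the data drives. A quick sanity check (yarn node -list is the standard YARN CLI; the paths are the ones configured above):

# All NodeManagers should report RUNNING again instead of unhealthy
yarn node -list

# New intermediate data should now accumulate on the data drives, not /usr/hdp
df -h /mnt/datadrive01 /usr/hdp

Once the next jobs run, /usr/hdp usage should stop climbing and the Ambari alert should clear on its next check.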