Input path does not exist: file:……………………………./pigsample_1406502801_1378470046724

Hi guys,

Here is one more issue that is specific to Cygwin + Pig.

You may see "Input path does not exist: <some path>/pigsample_<some number>" on Cygwin when running a script that contains an ORDER BY clause. It took me some time to figure out that the ORDER BY clause was the cause.

A typical stack trace looks like this:

2013-09-06 17:50:52,110 [Thread-118] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0008
java.lang.RuntimeException: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/E:/<directory from grunt started>/pigsample_1406502801_1378470046724

at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.setConf(WeightedRangePartitioner.java:157)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/E:/<directory from grunt started>/pigsample_1406502801_1378470046724

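Solution:

Instead of a top-level ORDER BY, you can sort inside a nested FOREACH. Here A1 is assumed to be a grouped relation, so A0 is the bag inside each group. The nested ORDER sorts within each bag and does not go through WeightedRangePartitioner, which is where the stack trace above fails.

A2 = FOREACH A1 {
    A3 = ORDER A0 BY fieldName;
    GENERATE $0, $1…….
}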
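To make the workaround concrete, here is a minimal sketch of a complete script. It assumes A1 is built by grouping all of A0 into a single bag with GROUP ... ALL. The input file name, the delimiter, the schema and the second field (value) are hypothetical and used only for illustration.

```pig
-- Hypothetical input: a comma-separated file with two columns.
A0 = LOAD 'input.txt' USING PigStorage(',') AS (fieldName:chararray, value:int);

-- Put every record into one bag so it can be sorted inside a nested FOREACH.
A1 = GROUP A0 ALL;

-- Sort the bag within the FOREACH block instead of using a top-level ORDER BY.
A2 = FOREACH A1 {
    A3 = ORDER A0 BY fieldName;
    -- FLATTEN turns the sorted bag back into individual records.
    GENERATE FLATTEN(A3);
};

DUMP A2;
```

Note that GROUP ... ALL sends all records to a single reducer, so this is probably only reasonable for small data sets such as the ones you would run in local mode on Cygwin.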
