HBase Learning Path (6): Filters in Detail

Filters (Filter)

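  The query operations in the basic API fall short when you have to search large volumes of data, so HBase provides a more advanced query mechanism: Filter. A Filter can select data by column family, column, version, and other conditions. Because HBase keeps data sorted along three dimensions (row key, column, and version), Filters can do this work efficiently. An RPC query that carries a Filter ships that Filter to every RegionServer, so the filtering runs on the server side, which also reduces the amount of data sent over the network.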
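The three sort dimensions mentioned above can be pictured with a small plain-Java model. This is only a sketch: ToyCell and SortedOrderSketch are hypothetical names rather than HBase classes, and String comparison stands in for HBase's byte-wise comparison (the two agree for the ASCII keys used here). It orders cells by row key ascending, then column ascending, then timestamp descending, so the newest version of a column comes first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model of a cell coordinate; not an HBase class.
final class ToyCell {
    final String row;
    final String column;
    final long timestamp;

    ToyCell(String row, String column, long timestamp) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
    }

    @Override
    public String toString() {
        return row + "/" + column + "/" + timestamp;
    }
}

final class SortedOrderSketch {
    // Row ascending, then column ascending, then timestamp descending (newest first).
    static final Comparator<ToyCell> ORDER = (a, b) -> {
        int byRow = a.row.compareTo(b.row);
        if (byRow != 0) {
            return byRow;
        }
        int byColumn = a.column.compareTo(b.column);
        if (byColumn != 0) {
            return byColumn;
        }
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<ToyCell> sorted(List<ToyCell> cells) {
        List<ToyCell> copy = new ArrayList<>(cells);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<ToyCell> cells = Arrays.asList(
                new ToyCell("95002", "info:name", 100L),
                new ToyCell("95001", "info:sex", 100L),
                new ToyCell("95001", "info:name", 100L),
                new ToyCell("95001", "info:name", 200L));
        for (ToyCell cell : sorted(cells)) {
            System.out.println(cell);
        }
    }
}
```

Because the data is already stored in this order, a filter on row key, column, or version can often skip ahead instead of inspecting every cell.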

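  A comparison-based filter needs at least two parameters. The first is an abstract operator; HBase provides an enum for these: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, and so on. The second is a concrete comparator (Comparator), which supplies the actual comparison logic, for example byte-level comparison or string-level comparison. With these two parameters you can state the filter condition precisely and filter the data.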

Abstract operators (comparison operators)

LESS <

LESS_OR_EQUAL <=

EQUAL =

NOT_EQUAL <>

GREATER_OR_EQUAL >=

GREATER >

NO_OP excludes everything
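As a rough illustration of how the operators listed above turn a comparison result into a keep-or-drop decision, here is a plain-Java sketch. The enum Op and the method keeps are hypothetical names used only for this example; they are not the HBase types, and the real implementation is organized differently internally.

```java
final class CompareOpSketch {
    // Mirrors the operator names listed above; hypothetical enum, not the HBase type.
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

    // cmp is the result of comparing the stored value with the reference value:
    // negative if stored < reference, zero if equal, positive if stored > reference.
    static boolean keeps(Op op, int cmp) {
        switch (op) {
            case LESS:             return cmp < 0;
            case LESS_OR_EQUAL:    return cmp <= 0;
            case EQUAL:            return cmp == 0;
            case NOT_EQUAL:        return cmp != 0;
            case GREATER_OR_EQUAL: return cmp >= 0;
            case GREATER:          return cmp > 0;
            case NO_OP:            return false; // excludes everything
            default:               throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    public static void main(String[] args) {
        // "95008" sorts after "95007", so the comparison result is positive.
        int cmp = "95008".compareTo("95007");
        System.out.println(keeps(Op.GREATER, cmp));
        System.out.println(keeps(Op.LESS, cmp));
        System.out.println(keeps(Op.NO_OP, cmp));
    }
}
```

The operator only interprets the sign of the comparison; how the comparison itself is carried out is the comparator's job.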

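Comparators (specify the comparison mechanism)

BinaryComparator: compares the given byte array byte by byte in lexicographic order, using Bytes.compareTo(byte[], byte[])

BinaryPrefixComparator: same as above, but compares only the leading bytes (a prefix match)

NullComparator: checks whether the given value is null

BitComparator: compares bit by bit

RegexStringComparator: compares against a regular expression; supports only EQUAL and NOT_EQUAL

SubstringComparator: checks whether the given substring occurs in the value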
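To make the comparator descriptions above more concrete, the following plain-Java sketch imitates four of them using only the standard library. All names here (ComparatorSketch and its methods) are hypothetical and are not HBase classes. The substring check lowercases both sides because HBase documents SubstringComparator as case-insensitive; treat that detail, and the rest of the sketch, as an approximation rather than the library's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

final class ComparatorSketch {
    // Lexicographic comparison of unsigned bytes (the BinaryComparator idea).
    static int binaryCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Compares only the leading prefix.length bytes of value (the BinaryPrefixComparator idea).
    static int prefixCompare(byte[] value, byte[] prefix) {
        int n = Math.min(value.length, prefix.length);
        return binaryCompare(Arrays.copyOf(value, n), prefix);
    }

    // True if substr occurs anywhere in value, ignoring case (the SubstringComparator idea).
    static boolean substringMatch(String value, String substr) {
        return value.toLowerCase(Locale.ROOT).contains(substr.toLowerCase(Locale.ROOT));
    }

    // True if the regular expression matches some part of value (the RegexStringComparator idea).
    static boolean regexMatch(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(binaryCompare(utf8("95008"), utf8("95007")) > 0);
        System.out.println(prefixCompare(utf8("95012"), utf8("9501")) == 0);
        System.out.println(substringMatch("Liu Chen", "chen"));
        System.out.println(regexMatch("95007", "^950\\d+$"));
    }
}
```

The substring and regex checks only answer "matches" or "does not match", which is why those comparators make sense only with EQUAL and NOT_EQUAL.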

Categories of HBase Filters

Comparison Filters

1. Row key filter: RowFilter

Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator("95007".getBytes())); 
scan.setFilter(rowFilter);

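import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only rows whose row key sorts after "95007" in byte order.
            Filter rowFilter = new RowFilter(CompareOp.GREATER, new BinaryComparator("95007".getBytes()));
            scan.setFilter(rowFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}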

[Screenshot: part of the run output]

2. Column family filter: FamilyFilter

Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator("info".getBytes())); 
scan.setFilter(familyFilter);

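import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only cells that belong to the column family "info".
            Filter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator("info".getBytes()));
            scan.setFilter(familyFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}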


3. Column (qualifier) filter: QualifierFilter

Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("name".getBytes())); 
scan.setFilter(qualifierFilter);

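import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.QualifierFilter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only cells whose column qualifier is exactly "name".
            Filter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("name".getBytes()));
            scan.setFilter(qualifierFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}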


4. Value filter: ValueFilter

Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男")); 
scan.setFilter(valueFilter);

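import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.filter.ValueFilter;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only cells whose value contains the substring "男" (male).
            Filter valueFilter = new ValueFilter(CompareOp.EQUAL, new SubstringComparator("男"));
            scan.setFilter(valueFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        System.out.println(cell);
                    }
                }
            }
        }
    }
}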


5. Timestamp filter: TimestampsFilter

List<Long> list = new ArrayList<>(); 
list.add(1522469029503L); 
TimestampsFilter timestampsFilter = new TimestampsFilter(list); 
scan.setFilter(timestampsFilter);

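import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.TimestampsFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only cells whose timestamp is one of the listed values.
            List<Long> list = new ArrayList<>();
            list.add(1522469029503L);
            TimestampsFilter timestampsFilter = new TimestampsFilter(list);
            scan.setFilter(timestampsFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // Print row key, family, qualifier, value, and timestamp, tab-separated.
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}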

Dedicated Filters

1. Single-column value filter: SingleColumnValueFilter (returns the whole row when the condition is met)

SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter( 
                "info".getBytes(), // column family 
                "name".getBytes(), // column qualifier 
                CompareOp.EQUAL,  
                new SubstringComparator("刘晨")); 
// Unless this is set to true, rows that do not contain the specified column are returned as well 
singleColumnValueFilter.setFilterIfMissing(true); 
scan.setFilter(singleColumnValueFilter);

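import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Return whole rows whose info:name value contains "刘晨".
            SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                    "info".getBytes(),
                    "name".getBytes(),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Also drop rows that have no info:name column at all.
            singleColumnValueFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // Print row key, family, qualifier, value, and timestamp, tab-separated.
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}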


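2. Single-column value exclude filter: SingleColumnValueExcludeFilter (like SingleColumnValueFilter, but the tested column itself is left out of the result)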

SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter( 
                "info".getBytes(),  
                "name".getBytes(),  
                CompareOp.EQUAL,  
                new SubstringComparator("刘晨")); 
singleColumnValueExcludeFilter.setFilterIfMissing(true); 
         
scan.setFilter(singleColumnValueExcludeFilter);

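import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Match rows whose info:name contains "刘晨", but omit info:name from the output.
            SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
                    "info".getBytes(),
                    "name".getBytes(),
                    CompareOp.EQUAL,
                    new SubstringComparator("刘晨"));
            // Also drop rows that have no info:name column at all.
            singleColumnValueExcludeFilter.setFilterIfMissing(true);
            scan.setFilter(singleColumnValueExcludeFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // Print row key, family, qualifier, value, and timestamp, tab-separated.
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}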
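The two single-column filters above differ in only one respect, and setFilterIfMissing changes what happens to rows that lack the tested column. The plain-Java model below tries to show both effects on a tiny in-memory table. It is a sketch under assumptions: SingleColumnSketch, filter, and sampleRows are hypothetical names, the sample rows are invented for illustration, and a case-sensitive substring check stands in for the comparator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SingleColumnSketch {
    // rows maps row key -> (column -> value). Returns the rows that pass the check.
    // filterIfMissing: when true, rows that lack the column are dropped.
    // excludeColumn: when true, the tested column is removed from each returned row.
    static Map<String, Map<String, String>> filter(
            Map<String, Map<String, String>> rows, String column, String substring,
            boolean filterIfMissing, boolean excludeColumn) {
        Map<String, Map<String, String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> entry : rows.entrySet()) {
            Map<String, String> columns = entry.getValue();
            String value = columns.get(column);
            boolean keep = (value == null) ? !filterIfMissing : value.contains(substring);
            if (!keep) {
                continue;
            }
            Map<String, String> copy = new LinkedHashMap<>(columns);
            if (excludeColumn) {
                copy.remove(column);
            }
            out.put(entry.getKey(), copy);
        }
        return out;
    }

    // Invented sample data: row 95003 has no "name" column.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        Map<String, String> r1 = new LinkedHashMap<>();
        r1.put("name", "Li Yong");
        r1.put("sex", "male");
        Map<String, String> r2 = new LinkedHashMap<>();
        r2.put("name", "Liu Chen");
        r2.put("sex", "female");
        Map<String, String> r3 = new LinkedHashMap<>();
        r3.put("sex", "female");
        rows.put("95001", r1);
        rows.put("95002", r2);
        rows.put("95003", r3);
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> rows = sampleRows();
        System.out.println(filter(rows, "name", "Liu Chen", false, false).keySet());
        System.out.println(filter(rows, "name", "Liu Chen", true, false).keySet());
        System.out.println(filter(rows, "name", "Liu Chen", true, true));
    }
}
```

In this model, leaving filterIfMissing at false lets row 95003 through even though it has no name column, which is the behavior the comment on setFilterIfMissing warns about.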


3. Prefix filter: PrefixFilter (applies to row keys)

PrefixFilter prefixFilter = new PrefixFilter("9501".getBytes()); 
         
scan.setFilter(prefixFilter);

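import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only rows whose row key starts with "9501".
            PrefixFilter prefixFilter = new PrefixFilter("9501".getBytes());
            scan.setFilter(prefixFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // Print row key, family, qualifier, value, and timestamp, tab-separated.
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}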


4. Column prefix filter: ColumnPrefixFilter

ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter("name".getBytes()); 
         
scan.setFilter(columnPrefixFilter);

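import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseFilterTest2 {

    private static final String ZK_CONNECT_KEY = "hbase.zookeeper.quorum";
    private static final String ZK_CONNECT_VALUE = "hadoop1:2181,hadoop2:2181,hadoop3:2181";

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set(ZK_CONNECT_KEY, ZK_CONNECT_VALUE);

        // try-with-resources closes the connection and the table when the scan is done.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {

            Scan scan = new Scan();
            // Keep only cells whose column qualifier starts with "name".
            ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter("name".getBytes());
            scan.setFilter(columnPrefixFilter);

            try (ResultScanner resultScanner = table.getScanner(scan)) {
                for (Result result : resultScanner) {
                    for (Cell cell : result.listCells()) {
                        // Print row key, family, qualifier, value, and timestamp, tab-separated.
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneValue(cell)) + "\t"
                                + cell.getTimestamp());
                    }
                }
            }
        }
    }
}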


5. Page filter: PageFilter

 

Original article by Maggie-Hunter. If you repost it, please credit the source: https://blog.ytso.com/tech/bigdata/9010.html
