Java 8 Stream: Grouping and Aggregating Custom Objects by Multiple Attributes

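Java code often needs to group a collection of objects and aggregate values within each group. When the grouping depends on several attributes at once and other attributes must be summed, the Java 8 Stream API handles it with Collectors.groupingBy combined with a custom key class and a custom aggregating collector. This article walks through that approach step by step.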

Problem Scenario

Suppose we have a Student class with the attributes name, age, city, salary, and incentive:

public class Student {
    private String name;
    private int age;
    private String city;
    private double salary;
    private double incentive;

    public Student(String name, int age, String city, double salary, double incentive) {
        this.name = name;
        this.age = age;
        this.city = city;
        this.salary = salary;
        this.incentive = incentive;
    }

    // Getters for all fields
    public String getName() { return name; }
    public int getAge() { return age; }
    public String getCity() { return city; }
    public double getSalary() { return salary; }
    public double getIncentive() { return incentive; }

    // Optional: toString for easy printing
    @Override
    public String toString() {
        return "Student{" +
               "name='" + name + '\'' +
               ", age=" + age +
               ", city='" + city + '\'' +
               ", salary=" + salary +
               ", incentive=" + incentive +
               '}';
    }
}

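Given a list of Student instances, we want to group them by name, age, and city, sum the salary and incentive of the students in each group, and produce a list containing one aggregated Student object per group.

For example, given the following input: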

Student("Raj",10,"Pune",10000,100)
Student("Raj",10,"Pune",20000,200)
Student("Raj",20,"Pune",10000,100)
Student("Ram",30,"Pune",10000,100)
Student("Ram",30,"Pune",30000,300)
Student("Seema",10,"Pune",10000,100)

The expected output is:

Student("Raj",10,"Pune",30000,300) // (10000+20000), (100+200)
Student("Raj",20,"Pune",10000,100)
Student("Ram",30,"Pune",40000,400) // (10000+30000), (100+300)
Student("Seema",10,"Pune",10000,100)

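Solution: A Custom Key and a Custom Aggregator

Grouping by multiple attributes with custom aggregation needs two building blocks: a key object that represents the grouping criteria, and an aggregator that accumulates the values.

1. Define the custom grouping key NameAgeCity

A Map.Entry holds only two elements, so it cannot directly serve as a key for three attributes. A concise, maintainable option is a small class that wraps all the grouping attributes. On Java 8 we must implement equals() and hashCode() by hand so that the Map treats keys with the same attribute values as equal.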

import java.util.Objects;

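// Declared as a static nested class (for example inside StudentAggregator);
// drop the static modifier if you make it a top-level class.
public static class NameAgeCity {
    private final String name;
    private final int age;
    private final String city;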

    public NameAgeCity(String name, int age, String city) {
        this.name = name;
        this.age = age;
        this.city = city;
    }

    // Getters
    public String getName() { return name; }
    public int getAge() { return age; }
    public String getCity() { return city; }

    // Static factory method for building a NameAgeCity from a Student
    public static NameAgeCity from(Student s) {
        return new NameAgeCity(s.getName(), s.getAge(), s.getCity());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        NameAgeCity that = (NameAgeCity) o;
        return age == that.age &&
               Objects.equals(name, that.name) &&
               Objects.equals(city, that.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age, city);
    }
}
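When a dedicated key class feels heavy, one alternative sometimes used is a List as the composite key, since List defines equals() and hashCode() element by element. The sketch below is only an illustration under assumptions: the Row class and the salaryByKey helper are hypothetical stand-ins, not part of the article's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ListKeyDemo {
    // Hypothetical, minimal stand-in for the article's Student class.
    static class Row {
        final String name;
        final int age;
        final String city;
        final double salary;

        Row(String name, int age, String city, double salary) {
            this.name = name;
            this.age = age;
            this.city = city;
            this.salary = salary;
        }
    }

    // Groups by (name, age, city) using a List as the composite key and sums the salary.
    static Map<List<Object>, Double> salaryByKey(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            r -> Arrays.<Object>asList(r.name, r.age, r.city),
            Collectors.summingDouble(r -> r.salary)));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("Raj", 10, "Pune", 10000),
            new Row("Raj", 10, "Pune", 20000),
            new Row("Raj", 20, "Pune", 10000));
        Map<List<Object>, Double> totals = salaryByKey(rows);
        // Two rows share the key [Raj, 10, Pune], so their salaries are summed.
        System.out.println(totals.get(Arrays.<Object>asList("Raj", 10, "Pune"))); // 30000.0
    }
}
```

The trade-off is that a List key is untyped and positional, so a named key class such as NameAgeCity usually stays easier to read and refactor.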

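Notes:

On Java 16 and later, a record defines this class more concisely; the compiler generates equals(), hashCode(), and toString() automatically. For example: public record NameAgeCity(String name, int age, String city) {}
Correct equals() and hashCode() implementations are essential: the Map uses them to decide whether two keys belong to the same group.

2. Define the custom aggregator AggregatedValues

We need an object that accumulates the salary and incentive of each group. It serves as the Collector's intermediate mutable container, whose state is updated as the stream is processed.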
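To make the equals()/hashCode() point concrete, here is a small sketch comparing a key class that relies on the default identity-based methods with one that overrides them. BrokenKey, GoodKey, and the two counting helpers are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class KeyEqualityDemo {
    // Key WITHOUT equals/hashCode: two instances are equal only if they are the same object.
    static class BrokenKey {
        final String name;
        final int age;

        BrokenKey(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Key WITH value-based equals/hashCode.
    static class GoodKey {
        final String name;
        final int age;

        GoodKey(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof GoodKey)) return false;
            GoodKey that = (GoodKey) o;
            return age == that.age && Objects.equals(name, that.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }

    // Every element gets its own group, because no two BrokenKey instances are equal.
    static int brokenGroupCount(List<String> names) {
        return names.stream().collect(Collectors.groupingBy(n -> new BrokenKey(n, 10))).size();
    }

    // Elements with the same name share a group.
    static int goodGroupCount(List<String> names) {
        return names.stream().collect(Collectors.groupingBy(n -> new GoodKey(n, 10))).size();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Raj", "Raj", "Ram");
        System.out.println(brokenGroupCount(names)); // 3
        System.out.println(goodGroupCount(names));   // 2
    }
}
```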

import java.util.function.Consumer;

// Declared as a static nested class; drop the static modifier for a top-level class.
public static class AggregatedValues implements Consumer<Student> {
    private String name;
    private int age;
    private String city;
    private double salary;
    private double incentive;

    // Getters
    public String getName() { return name; }
    public int getAge() { return age; }
    public String getCity() { return city; }
    public double getSalary() { return salary; }
    public double getIncentive() { return incentive; }

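    // Accumulator: takes one Student and updates the aggregate state
    @Override
    public void accept(Student s) {
        // Every Student in a group shares the same name, age and city, so the key
        // fields can simply be overwritten. (Using age == 0 to detect the first
        // element would misbehave for a student whose age really is 0.)
        name = s.getName();
        age = s.getAge();
        city = s.getCity();

        // Add up salary and incentive
        salary += s.getSalary();
        incentive += s.getIncentive();
    }

    // Combiner: folds the state of another AggregatedValues into this one
    public AggregatedValues merge(AggregatedValues other) {
        // Defensive: adopt the key fields if this container has not seen a Student yet
        if (name == null) {
            name = other.name;
            age = other.age;
            city = other.city;
        }
        this.salary += other.salary;
        this.incentive += other.incentive;
        return this;
    }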

    // Finisher: converts the aggregated result into a Student
    public Student toStudent() {
        return new Student(name, age, city, salary, incentive);
    }

    // Optional: toString for easy printing
    @Override
    public String toString() {
        return "AggregatedValues{" +
               "name='" + name + '\'' +
               ", age=" + age +
               ", city='" + city + '\'' +
               ", salary=" + salary +
               ", incentive=" + incentive +
               '}';
    }
}

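Key points about AggregatedValues:

It implements Consumer<Student>; its accept method adds the data of a single Student to the running totals.
The merge method combines intermediate results produced by different threads when the stream runs in parallel.
The toStudent method is a converter that turns the aggregated data back into the original Student type when that is needed.

3. Aggregate with Collectors.groupingBy and Collector.of

We can now combine Collectors.groupingBy and Collector.of to perform the grouping and aggregation. Collector.of builds a custom Collector from up to four functions (the finisher is optional; the three-function overload returns the container itself):

supplier: creates a new result container (an AggregatedValues instance).
accumulator: adds a stream element (a Student) to the result container (by calling AggregatedValues::accept).
combiner: merges two result containers (by calling AggregatedValues::merge); used for parallel streams.
finisher: transforms the final result container (by calling AggregatedValues::toStudent) into the final result type.

Example code: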
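The four functions can be seen in isolation with a much smaller collector. The sketch below is an illustration only: the averaging() helper and its double[] container are assumptions made for this example, not part of the article's solution. It builds an averaging collector with Collector.of and runs it on both a sequential and a parallel stream, where the combiner comes into play.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class CollectorOfDemo {
    // Hypothetical helper: averages doubles using a two-slot array as the mutable container.
    static Collector<Double, double[], Double> averaging() {
        return Collector.of(
            () -> new double[2],                                  // supplier: [sum, count]
            (acc, x) -> { acc[0] += x; acc[1]++; },               // accumulator: add one element
            (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; },  // combiner: merge partial results
            acc -> acc[1] == 0 ? 0.0 : acc[0] / acc[1]);          // finisher: container -> average
    }

    public static void main(String[] args) {
        List<Double> salaries = Arrays.asList(10000.0, 20000.0, 30000.0);
        System.out.println(salaries.stream().collect(averaging()));         // 20000.0
        System.out.println(salaries.parallelStream().collect(averaging())); // 20000.0
    }
}
```

AggregatedValues plays the same role as the double[] here, with accept, merge, and toStudent standing in for the accumulator, combiner, and finisher lambdas.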

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class StudentAggregator {

    public static void main(String[] args) {
        List<Student> students = new ArrayList<>();
        // Collections.addAll keeps this Java 8 compatible; on Java 9+ List.of() also works
        Collections.addAll(students,
            new Student("Raj", 10, "Pune", 10000, 100),
            new Student("Raj", 10, "Pune", 20000, 200),
            new Student("Raj", 20, "Pune", 10000, 100),
            new Student("Ram", 30, "Pune", 10000, 100),
            new Student("Ram", 30, "Pune", 30000, 300),
            new Student("Seema", 10, "Pune", 10000, 100)
        );

        // Option 1: aggregate into a List<AggregatedValues>
        List<AggregatedValues> aggregatedValuesList = students.stream()
            .collect(Collectors.groupingBy(
                NameAgeCity::from, // classifier: maps each Student to its NameAgeCity key
                Collector.of(      // downstream collector: the custom aggregator
                    AggregatedValues::new,    // supplier: creates a new AggregatedValues
                    AggregatedValues::accept, // accumulator: adds a Student to the totals
                    AggregatedValues::merge   // combiner: merges two AggregatedValues
                )
            ))
            .values().stream() // the Map values are the AggregatedValues instances
            .collect(Collectors.toList());

        System.out.println("--- Aggregated result (AggregatedValues type) ---");
        aggregatedValuesList.forEach(System.out::println);

        System.out.println("\n--- Aggregated result (Student type) ---");
        // Option 2: convert the aggregated result directly into a List<Student>
        List<Student> resultStudents = students.stream()
            .collect(Collectors.groupingBy(
                NameAgeCity::from, // classifier
                Collector.of(      // downstream collector
                    AggregatedValues::new,       // supplier
                    AggregatedValues::accept,    // accumulator
                    AggregatedValues::merge,     // combiner
                    AggregatedValues::toStudent  // finisher: AggregatedValues -> Student
                )
            ))
            .values().stream() // the Map values are now Student instances
            .collect(Collectors.toList());

        resultStudents.forEach(System.out::println);
    }
}

Output (the order of the groups may vary, since the Map returned by groupingBy does not guarantee any iteration order):

--- Aggregated result (AggregatedValues type) ---
AggregatedValues{name='Raj', age=20, city='Pune', salary=10000.0, incentive=100.0}
AggregatedValues{name='Raj', age=10, city='Pune', salary=30000.0, incentive=300.0}
AggregatedValues{name='Ram', age=30, city='Pune', salary=40000.0, incentive=400.0}
AggregatedValues{name='Seema', age=10, city='Pune', salary=10000.0, incentive=100.0}

--- Aggregated result (Student type) ---
Student{name='Raj', age=20, city='Pune', salary=10000.0, incentive=100.0}
Student{name='Raj', age=10, city='Pune', salary=30000.0, incentive=300.0}
Student{name='Ram', age=30, city='Pune', salary=40000.0, incentive=400.0}
Student{name='Seema', age=10, city='Pune', salary=10000.0, incentive=100.0}
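The two-argument groupingBy makes no promise about the order of the resulting Map, so the groups above may print in a different order from the input. If the groups should follow the order in which their keys first appear, one option is the three-argument overload of groupingBy, which accepts a map factory. The sketch below uses a hypothetical keysInEncounterOrder helper on plain strings to show the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedGroupingDemo {
    // Groups words and returns the group keys in first-appearance order.
    static List<String> keysInEncounterOrder(List<String> words) {
        Map<String, Long> counts = words.stream().collect(Collectors.groupingBy(
            w -> w,                 // classifier
            LinkedHashMap::new,     // map factory: a LinkedHashMap keeps insertion order
            Collectors.counting()   // downstream collector
        ));
        return new ArrayList<>(counts.keySet());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Raj", "Raj", "Ram", "Seema", "Ram");
        System.out.println(keysInEncounterOrder(names)); // [Raj, Ram, Seema]
    }
}
```

The same map factory argument can be placed between NameAgeCity::from and the Collector.of(...) call in the example above.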

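Notes and Summary

equals() and hashCode() on the custom key: this is the foundation for using a custom object as a Map key. Without a correct implementation, groupingBy cannot recognize that two elements belong to the same group.
Flexibility of Collector.of: it is a powerful tool for building complex aggregation logic, giving full control over the supply, accumulate, and combine stages as well as the final transformation of the result.
Mutable accumulation: AggregatedValues accumulates by mutating its own state. This is usually more efficient than creating many intermediate immutable objects, especially with large data sets.
Java version compatibility: the code in this tutorial is fully compatible with Java 8. On later versions you can simplify the key class with a record (Java 16+) and use newer features such as List.of() (Java 9+) and stream().toList() (Java 16+).
Separation of business logic: encapsulating the aggregation logic in the AggregatedValues class keeps the code clearly structured and easy to maintain and test.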
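For comparison, a lighter alternative to the custom collector is Collectors.toMap with a merge function, which creates a new immutable object for each merge instead of mutating a container. This is a different technique from the one described above, shown only as a sketch: the Pay class, its plus method, and the aggregate helper are hypothetical stand-ins for Student.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapMergeDemo {
    // Hypothetical immutable stand-in for the article's Student class.
    static class Pay {
        final String name;
        final int age;
        final String city;
        final double salary;
        final double incentive;

        Pay(String name, int age, String city, double salary, double incentive) {
            this.name = name;
            this.age = age;
            this.city = city;
            this.salary = salary;
            this.incentive = incentive;
        }

        // Returns a new Pay with the sums of both salaries and incentives.
        Pay plus(Pay other) {
            return new Pay(name, age, city, salary + other.salary, incentive + other.incentive);
        }
    }

    static List<Pay> aggregate(List<Pay> input) {
        Map<List<Object>, Pay> merged = input.stream().collect(Collectors.toMap(
            p -> Arrays.<Object>asList(p.name, p.age, p.city), // composite key
            p -> p,                                            // value for a key seen the first time
            Pay::plus,                                         // merge function for repeated keys
            LinkedHashMap::new));                              // keep first-appearance order
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Pay> result = aggregate(Arrays.asList(
            new Pay("Raj", 10, "Pune", 10000, 100),
            new Pay("Raj", 10, "Pune", 20000, 200),
            new Pay("Ram", 30, "Pune", 10000, 100)));
        System.out.println(result.size());        // 2
        System.out.println(result.get(0).salary); // 30000.0
    }
}
```

This version allocates one extra object per merge, which is the cost the mutable AggregatedValues container avoids.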