| 0 | Question (perception) | What are the important objects in the current scene? Those objects will be considered for the future reasoning and driving decision. |
| GT Answer | There is a walking person to the back right of the ego vehicle, a people on the road to the back of the ego vehicle, a white sedan to the back of the ego vehicle, and a barrier gate to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK_RIGHT,1083.3,552.5>, <c2,CAM_BACK,684.2,519.2>, <c3,CAM_BACK,827.5,577.5>, and <c4,CAM_FRONT,1465.1,340.9>. |
| InternVL2-2B | In the current scene, the important objects include:
1. **The Car**: The car is the primary object of interest. It is the vehicle in motion, traveling down the road. The car's position and movement are crucial for understanding the driving scenario.
2. **The Road**: The road is the primary means of transportation in this scene. It is the path the car is following, and its condition (wet and slippery) affects the driving conditions.
3. **The Sidewalk and Street**: The sidewalk and the street are essential elements of the road infrastructure. They provide a safe and navigable space for pedestrians and vehicles.
4. **The Traffic Sign**: The traffic sign is an important object, as it indicates the direction of traffic flow and provides information about parking availability. It is located near the car, which helps in understanding the traffic situation.
5. **The Building**: The building in the background is significant as it provides context and location information. It helps in understanding the environment and the surrounding area.
6. **The Pedestrian**: The pedestrian is an important object, as they are moving along the sidewalk. Their presence and movement can affect the traffic flow and safety.
7. **The Streetlight**: The streetlight is an important object, as it provides lighting and safety for the road. Its position and condition can affect visibility and safety.
These objects are essential for understanding the driving scenario and making informed decisions. |
| Mini-InternVL2-2B-DA-DriveLM | There is a white sedan to the back of the ego vehicle, a pedestrian wearing a white shirt to the back right of the ego vehicle, and a yellow and white speed bump to the front of the ego vehicle. The IDs of these objects are 1, 2, and 3. |
| llama_adapter_v2_multimodal7b | In the current scene, the important objects to consider for future reasoning and driving decisions include the green traffic light, the white car, and the building. The green traffic light indicates that the car can proceed, while the white car is approaching the intersection. The building in the background may also be relevant, as it could be a landmark or a point of interest that the driver should be aware of. The presence of these objects will help the driver make informed decisions about their driving path and ensure safe navigation through the intersection. |
| InternVL2-2b-LoRA-25k-drivelm | There is a white sedan to the back of the ego vehicle, a white sedan to the front of the ego vehicle, a white sedan to the front of the ego vehicle, and a white sedan to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK,1080.0,525.8>, <c2,CAM_FRONT,1080.0,505.8>, <c3,CAM_FRONT,1080.0,505.8>, and <c4,CAM_FRONT,1080.0,505.8>. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | There is a pedestrian wearing a red top and black pants to the back right of the ego vehicle, a white sedan to the back of the ego vehicle, a white sedan to the back of the ego vehicle, and a red and white sign to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK_RIGHT,1083.3,552.5>, <c2,CAM_BACK,684.2,519.2>, <c3,CAM_BACK,827.5,577.5>, and <c4,CAM_FRONT,1465.1,340.9>. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | There is a white sedan to the back of the ego vehicle, a white sedan to the front of the ego vehicle, and a white sedan to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK,1080.0,525.0>, <c2,CAM_FRONT,1080.0,525.0>, and <c3,CAM_FRONT,1080.0,525.0>. |
| InternVL2-2b-LoRA-300k-drivelm | There is a white sedan to the back of the ego vehicle, a white sedan to the front of the ego vehicle, a white sedan to the back of the ego vehicle, and a white sedan to the front of the ego vehicle. The IDs of these objects are <c1,CAM_BACK,1080.8,545.0>, <c2,CAM_FRONT,1080.8,505.8>, <c3,CAM_BACK,1080.8,520.8>, and <c4,CAM_FRONT,1080.8,505.8>. |
| llava-v1.6-mistral-7b-hf | In the current scene, the important objects to consider for future reasoning and driving decisions include:
1. The road ahead, which appears to be clear and free of obstacles.
2. The traffic light, which is currently green, indicating that it is safe to proceed.
3. The pedestrian crossing, which is not currently in use but should be considered for potential pedestrians.
4. The speed limit sign, which provides information on the maximum allowed speed.
5. The car in front, which is currently driving on the road.
6. The parking lot sign, which indicates the availability of parking spaces.
7. The building in the background, which might be a point of interest or a destination.
8. The sky, which could provide information about the weather conditions.
These objects are crucial for understanding the current state of the road and making informed decisions about driving. |
| 1 | Question (perception) | What is the moving status of object <c1,CAM_BACK_RIGHT,1083.3,552.5>? Please select the correct answer from the following options: A. Back up. B. Stopped. C. Reverse parking. D. Going ahead. |
| GT Answer | D |
| InternVL2-2B | The moving status of object <c1,CAM_BACK_RIGHT,1083.3,552.5> is B. Stopped. |
| Mini-InternVL2-2B-DA-DriveLM | D |
| llama_adapter_v2_multimodal7b | The moving status of object <c1,CAM_BACK_RIGHT,1083.3,552.5> is "Reverse parking." |
| InternVL2-2b-LoRA-25k-drivelm | D |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | D |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | D |
| InternVL2-2b-LoRA-300k-drivelm | B. Stopped. |
| llava-v1.6-mistral-7b-hf | B. Stopped |
| 2 | Question (prediction) | What object should the ego vehicle notice first when the ego vehicle is getting to the next possible location? What is the state of the object that is first noticed by the ego vehicle and what action should the ego vehicle take? What object should the ego vehicle notice second when the ego vehicle is getting to the next possible location? What is the state of the object perceived by the ego vehicle as second and what action should the ego vehicle take? What object should the ego vehicle notice third? What is the state of the object perceived by the ego vehicle as third and what action should the ego vehicle take? |
| GT Answer | Firstly notice that <c4,CAM_FRONT,1465.1,340.9>, the state of which is a traffic sign, so the ego vehicle should slow down and proceed. Secondly, notice that <c3,CAM_BACK,827.5,577.5>, the state of which is going ahead, so the ego vehicle should continue going ahead at the same speed. Thirdly, notice that <c2,CAM_BACK,684.2,519.2>, the state of which is going ahead, so the ego vehicle should keep going ahead at the same speed. |
| InternVL2-2B | When the ego vehicle is getting to the next possible location, it should first notice the "NUS" sign, which indicates the National University of Singapore. The state of the object perceived by the ego vehicle as the first object to be noticed is the "NUS" sign. The action the ego vehicle should take is to proceed straight ahead.
When the ego vehicle is getting to the next possible location, it should notice the "CAM_FRONT_LEFT" sign, which indicates the direction to the left. The state of the object perceived by the ego vehicle as the second object to be noticed is the "CAM_FRONT_LEFT" sign. The action the ego vehicle should take is to proceed straight ahead.
When the ego vehicle is getting to the next possible location, it should notice the "CAM_FRONT_RIGHT" sign, which indicates the direction to the right. The state of the object perceived by the ego vehicle as the third object to be noticed is the "CAM_FRONT_RIGHT" sign. The action the ego vehicle should take is to proceed straight ahead. |
| Mini-InternVL2-2B-DA-DriveLM | Firstly, notice that the object is a traffic sign. The ego vehicle should continue ahead at the same speed. Secondly, notice that the object is stationary. The ego vehicle should continue ahead at the same speed. Thirdly, notice that the object is going ahead. The ego vehicle should continue ahead at the same speed. |
| llama_adapter_v2_multimodal7b | When the ego vehicle is getting to the next possible location, it should first notice the green traffic light. The ego vehicle should then take the appropriate action, which is to stop at the red stop sign. Next, the ego vehicle should notice the white car in front of it. The ego vehicle should then take the appropriate action, which is to proceed cautiously, maintaining a safe distance from the white car. Finally, the ego vehicle should notice the blue car on the side of the road. The ego vehicle should then take the appropriate action, which is to proceed with caution, being aware of the blue car's presence and any potential hazards or obstacles on the road. |
| InternVL2-2b-LoRA-25k-drivelm | Firstly, notice that <c3,CAM_FRONT,1080.8,525.8>. The object is going ahead, so the ego vehicle should keep going ahead at the same speed. Secondly, notice that <c2,CAM_FRONT,1080.8,500.8>. The object is going ahead, so the ego vehicle should keep going ahead at the same speed. Thirdly, notice that <c1,CAM_BACK,1080.8,525.8>. The object is going ahead, so the ego vehicle should keep going ahead at the same speed. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | Firstly, notice <c4,CAM_FRONT,1465.1,340.9>. The object is a traffic sign, so the ego vehicle should continue ahead at the same speed. Secondly, notice <c3,CAM_BACK,827.5,577.5>. The object is going ahead, so the ego vehicle should continue ahead at the same speed. Thirdly, notice <c1,CAM_BACK_RIGHT,1083.3,552.5>. The object is a traffic sign, so the ego vehicle should continue ahead at the same speed. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | Firstly, notice <c1, CAM_FRONT, 1200.0, 600.0>. The object is approaching, so the ego should slow down. Secondly, notice <c2, CAM_FRONT, 1200.0, 600.0>. The object is approaching, so the ego should slow down. Thirdly, notice <c3, CAM_FRONT, 1200.0, 600.0>. The object is approaching, so the ego should slow down. |
| InternVL2-2b-LoRA-300k-drivelm | Firstly, notice <c3,CAM_FRONT,1080.0,520.0>. It is going ahead, so the ego vehicle should keep going ahead at the same speed. Secondly, notice <c2,CAM_FRONT,1080.0,505.0>. It is going ahead, so the ego vehicle should keep going ahead at the same speed. Thirdly, notice <c1,CAM_BACK,1010.0,540.8>. It is going ahead, so the ego vehicle should keep going ahead at the same speed. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to be a collage of different camera angles from a vehicle's perspective, showing various scenes that the ego vehicle might encounter. Here's a breakdown of what the ego vehicle should notice and what actions it should take:
1. First object: The ego vehicle should notice the traffic light first when it's getting to the next possible location. The state of the traffic light is red, indicating that the ego vehicle should stop.
2. Second object: The ego vehicle should notice the pedestrian crossing in the second image. The pedestrian crossing is clear, so the ego vehicle should proceed with caution, checking for any pedestrians who might be crossing.
3. Third object: The third image shows a car park. The ego vehicle should notice the car park and check for any vehicles entering or leaving the parking lot. It should also be aware of any pedestrians who might be walking near the parking lot.
4. Fourth object: The fourth image shows a car park with a person standing near the entrance. The ego vehicle should notice the person and slow down or stop if necessary to ensure the person's safety.
5. Fifth object: The fifth image shows a car park with a person standing near the entrance. The ego vehicle should notice the person and slow down or stop if necessary to ensure the person's safety.
In each case, the ego vehicle should be prepared to react to the situation it encounters, whether that's stopping at a red light, yielding to pedestrians, or being cautious in a car park. |
| 3 | Question (prediction) | Is <c1,CAM_BACK_RIGHT,1083.3,552.5> a traffic sign or a road barrier? |
| GT Answer | No. |
| InternVL2-2B | To determine if the coordinates <c1,CAM_BACK_RIGHT,1083.3,552.5> represent a traffic sign or a road barrier, let's analyze the image in detail:
1. **Location and Orientation**:
- The coordinates are given in the format (x, y, z), where x, y, and z are the coordinates of the point relative to the image's origin.
- The coordinates are specified as (1083.3, 552.5).
2. **Visual Inspection**:
- **Traffic Sign**: A traffic sign typically has a distinct shape, color, and text that clearly indicates a specific rule or regulation.
- **Road Barrier**: A road barrier is usually a physical structure that obstructs or restricts the passage of vehicles.
3. **Image Analysis**:
- **Cam1**: The image shows a grassy area with a building and a road. There is no clear indication of a traffic sign or road barrier.
- **Cam2**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam3**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam4**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam5**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam6**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam7**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam8**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
- **Cam9**: The image shows a road with a sign on the right side, which is a typical feature of a road barrier.
4. **Conclusion**:
- The coordinates <c1,CAM_BACK_RIGHT,1083.3,552.5> are located on the right side of the road, and the image shows a sign on the right side of the road. This is consistent with the typical appearance of a road barrier.
**Conclusion |
| Mini-InternVL2-2B-DA-DriveLM | No. |
| llama_adapter_v2_multimodal7b | The image does not provide enough information to confidently determine whether the <c1,CAM_BACK_RIGHT,1083.3,552.5> is a traffic sign or a road barrier. It could be either, depending on the context and the specific location within the image. |
| InternVL2-2b-LoRA-25k-drivelm | No. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | No. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | No. |
| InternVL2-2b-LoRA-300k-drivelm | No. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to show a traffic sign rather than a road barrier. The sign is displaying the message "CAM_BACK_RIGHT," which is likely a reference to a specific camera angle or direction for a surveillance or traffic monitoring system. It's not a physical barrier that would be used to block traffic. |
| 4 | Question (planning) | What actions could the ego vehicle take based on <c1,CAM_BACK_RIGHT,1083.3,552.5>? Why take this action and what's the probability? |
| GT Answer | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2B | The ego vehicle, indicated by the car in the center of the image, is approaching a parking lot. Based on the given information, the actions it could take are:
1. **Enter the Parking Lot**: The ego vehicle is currently on the right side of the road, and the parking lot sign is visible. The probability of entering the parking lot is high, as the sign is clearly visible and the road is clear of any other vehicles. This action is likely to be taken if the ego vehicle intends to park in the lot.
2. **Turn Left**: The ego vehicle is currently on the right side of the road, and there is a visible turn lane on the left side. The probability of turning left is relatively low, as the turn lane is already occupied by another vehicle. However, if the ego vehicle intends to turn left, it might be necessary to navigate around the parked car or to find an empty lane.
3. **Exit the Parking Lot**: The ego vehicle is currently on the right side of the road, and the parking lot sign is visible. The probability of exiting the parking lot is high, as the sign is clearly visible and the road is clear of any other vehicles. This action is likely to be taken if the ego vehicle intends to leave the parking lot.
4. **Turn Right**: The ego vehicle is currently on the right side of the road, and there is a visible turn lane on the left side. The probability of turning right is relatively low, as the turn lane is already occupied by another vehicle. However, if the ego vehicle intends to turn right, it might be necessary to navigate around the parked car or to find an empty lane.
5. **Exit the Parking Lot**: The ego vehicle is currently on the right side of the road, and the parking lot sign is visible. The probability of exiting the parking lot is high, as the sign is clearly visible and the road is clear of any other vehicles. This action is likely to be taken if the ego vehicle intends to leave the parking lot.
In summary, the ego vehicle is likely to enter the parking lot if it intends to park, turn left if it intends to turn left, exit the parking lot if it intends to leave, and turn right if it intends to turn right. The probability of each action is relatively high, as the parking lot sign is clearly visible and the road is clear of any other vehicles. |
| Mini-InternVL2-2B-DA-DriveLM | The action is to keep going at the same speed. The reason is that there is no safety issue. The probability is high. |
| llama_adapter_v2_multimodal7b | Based on the given coordinates, the ego vehicle is at the intersection of a street and a parking lot. The vehicle could take several actions, such as:1. Continue straight ahead: The vehicle could continue driving straight ahead on the street, following the road rules and traffic signals. This action would be appropriate if the vehicle is not turning or entering the parking lot.2. Turn right: The vehicle could turn right into the parking lot, which would be a safe and legal maneuver if the parking lot is designated for turning.3. Turn left: The vehicle could turn left into the parking lot, which would be a safe and legal maneuver if the parking lot is designated for turning.4. Enter the parking lot: The vehicle could enter the parking lot, which would be a safe and legal maneuver if the parking lot is designated for entering.The probability of the vehicle taking any of these actions depends on the specific location and road conditions, as well as the vehicle's driver's intentions and the presence of other vehicles or pedestrians in the area. |
| InternVL2-2b-LoRA-25k-drivelm | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-300k-drivelm | The action is to keep going at the same speed. The reason is that there is no safety issue. The probability is high. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to be a sequence of four images showing a vehicle's perspective at different points in time, likely from a camera mounted on the back of the vehicle. The images are labeled with the following descriptions:
1. CAM_FRONT_LEFT
2. CAM_BACK_LEFT
3. CAM_BACK_RIGHT
4. CAM_FRONT_RIGHT
Based on the labeling, the vehicle is likely a car, and the images are taken from the perspective of the car's rear camera.
The action that the ego vehicle could take based on the labeling is to reverse or back up. This is because the vehicle is in a situation where it is facing away from the camera, and the rear camera is showing the view behind the vehicle.
The probability of the vehicle needing to reverse or back up depends on the context of the situation. If the vehicle is in a parking lot or a space where it is parked and needs to back out, the probability is high. If the vehicle is on a road and needs to reverse to avoid a collision or to correct a wrong turn, the probability is lower but still possible.
The exact probability would depend on the specific circumstances of the situation, which are not visible in the image. However, given the labels and the fact that the vehicle is facing away from the camera, it is more likely that the vehicle is in a situation where it needs to reverse or back up. |
| 5 | Question (planning) | What actions taken by the ego vehicle can lead to a collision with <c1,CAM_BACK_RIGHT,1083.3,552.5>? |
| GT Answer | No such action will lead to a collision. |
| InternVL2-2B | The ego vehicle, a white sedan, is approaching a road with a sign indicating "CAM_FRONT_RIGHT" and a speed limit of 50 km/h. The sign also mentions "CAR PARK 13" and "PARKING LOTS AVAILABLE."
Here are the actions the ego vehicle can take to avoid a collision with the sign:
1. **Slow Down**: The ego vehicle should reduce its speed to comply with the speed limit of 50 km/h. This will allow it to react more quickly to any potential changes in the road conditions.
2. **Check for Oncoming Vehicles**: The ego vehicle should maintain a safe distance from any vehicles that might be approaching from the opposite direction. It should also be prepared to stop if necessary.
3. **Observe the Sign**: The ego vehicle should closely observe the sign, especially the "CAM_FRONT_RIGHT" and "PARKING LOTS AVAILABLE" information. It should be aware of the proximity of the sign and any potential hazards it might encounter.
4. **Avoid Overreaching**: The ego vehicle should avoid overreaching the sign, as this could lead to a collision. It should maintain a safe distance from the sign and be prepared to stop if necessary.
5. **Use Lane Lines**: The ego vehicle should use the lane lines on the road to maintain its position and avoid any potential collisions with the sign.
By taking these actions, the ego vehicle can reduce the risk of a collision with the sign and ensure a safe passage through the area. |
| Mini-InternVL2-2B-DA-DriveLM | No such action will lead to a collision. |
| llama_adapter_v2_multimodal7b | The ego vehicle is driving down the street, and it is approaching a green traffic light. If the driver does not stop at the green light, as required by traffic rules, and continues to drive forward, they may collide with the <c1,CAM_BACK_RIGHT,1083.3,552.5> object. This collision could result in damage to the vehicle, injury to the passengers, or even more severe consequences if the vehicle is traveling at a high speed or if there are other vehicles or pedestrians nearby. It is crucial for drivers to follow traffic rules and be aware of their surroundings to avoid such incidents. |
| InternVL2-2b-LoRA-25k-drivelm | No such action will lead to a collision. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | No such action will lead to a collision. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | No such action will lead to a collision. |
| InternVL2-2b-LoRA-300k-drivelm | No such action will lead to a collision. |
| llava-v1.6-mistral-7b-hf | The image you've provided shows a sequence of four photos, each depicting a different view of a vehicle's rear camera (CAM_BACK_RIGHT) and the road ahead. The actions that could lead to a collision with the vehicle in the rear camera are:
1. **Failure to Yield**: In the first photo, the vehicle is approaching an intersection where it should yield to oncoming traffic. If the driver fails to yield, they could collide with the vehicle in the rear camera.
2. **Inadequate Following Distance**: In the second photo, the vehicle is following another vehicle too closely. If the vehicle in front suddenly stops or slows down, the driver may not have enough time to react and avoid a collision.
3. **Distracted Driving**: In the third photo, the driver is looking away from the road. This distraction could lead to a loss of situational awareness, increasing the risk of a collision.
4. **Speeding**: In the fourth photo, the vehicle is speeding towards the intersection. If the driver does not slow down in time, they could collide with the vehicle in the rear camera.
5. **Lack of Attention to Road Signs**: In the fourth photo, there is a sign indicating a parking lot ahead. If the driver does not notice this sign and continues to speed, they could enter the parking lot at an unsafe speed, potentially causing a collision.
6. **Inability to Stop**: In the fourth photo, the vehicle is shown to be speeding. If the driver does not notice the parking lot sign and does not have enough time to stop, they could collide with the vehicle in the rear camera.
7. **Poor Road Conditions**: If the road is wet, icy, or has any other hazardous conditions, the driver may have difficulty stopping or maneuvering the vehicle, increasing the risk of a collision.
8. **Mechanical Failure**: If the vehicle's brakes or other safety systems fail, the driver may not be able to stop in time to avoid a collision.
9. **Impaired Driving**: If the driver is under the influence of alcohol or drugs, their reaction time and decision-making abilities may be impaired, increasing the risk of a collision.
10. **Fatigue**: If the driver is fatigued, they may not be |
| 6 | Question (planning) | In this scenario, what are safe actions to take for the ego vehicle? |
| GT Answer | Brake gently to a stop or decelerate gradually without braking. |
| InternVL2-2B | Based on the image, the safe actions for the ego vehicle would include maintaining a safe following distance from the vehicle in front, as indicated by the "CAM_FRONT_LEFT" and "CAM_FRONT_RIGHT" overlays. The vehicle should also be prepared to stop if necessary, as suggested by the "CAM_FRONT" overlay. Additionally, the vehicle should be cautious of pedestrians and other vehicles, as indicated by the "CAM_FRONT" overlay. The presence of the "CAM_FRONT_LEFT" and "CAM_FRONT_RIGHT" overlays suggests that the vehicle is being monitored for its position relative to the other vehicles, which may require the driver to be vigilant and aware of their surroundings. |
| Mini-InternVL2-2B-DA-DriveLM | Keep going at the same speed, decelerate gradually without braking. |
| llama_adapter_v2_multimodal7b | In the image, there are multiple vehicles, including a car and a truck, driving on a street with a green traffic light. To ensure safety, the ego vehicle should follow the traffic light signals and maintain a safe distance from other vehicles, especially the truck. The driver should also be aware of the truck's blind spots and avoid sudden movements or lane changes that could cause an accident. Additionally, the driver should pay attention to other vehicles' signals and maintain a safe speed, especially when approaching intersections or merging lanes. By following these safe actions, the ego vehicle can minimize the risk of accidents and ensure a smooth and secure driving experience. |
| InternVL2-2b-LoRA-25k-drivelm | Keep going at the same speed, accelerate, and proceed ahead. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | Keep going at the same speed, decelerate gradually without braking. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | Keep going at the same speed, decelerate gradually without braking. |
| InternVL2-2b-LoRA-300k-drivelm | Keep going at the same speed, decelerate gradually without braking. |
| llava-v1.6-mistral-7b-hf | In the scenario depicted in the image, the ego vehicle is driving on a road with various traffic signs and signals. Here are some safe actions the ego vehicle should take:
1. **Follow Traffic Signs**: The ego vehicle should obey all traffic signs, including speed limits, stop signs, and yield signs.
2. **Respect Traffic Signals**: The vehicle should stop at red lights and proceed with caution at green lights.
3. **Maintain Safe Distance**: The ego vehicle should maintain a safe distance from other vehicles, especially when driving on a busy road or in a parking lot.
4. **Be Aware of Pedestrians**: The vehicle should be cautious around pedestrians, especially when making turns or entering/exiting parking lots.
5. **Use Turn Signals**: When turning, the vehicle should use the appropriate turn signals to indicate its intentions to other road users.
6. **Stay in the Lane**: The ego vehicle should stay within its lane and avoid drifting into other lanes or onto the shoulder.
7. **Check Blind Spots**: Before changing lanes or making turns, the driver should check their blind spots to ensure no other vehicles are in their path.
8. **Use Mirrors**: Regularly checking mirrors can help the driver stay aware of their surroundings and other vehicles.
9. **Slow Down in Parking Lots**: The vehicle should reduce its speed when driving in parking lots, as they are often shared with pedestrians and other vehicles.
10. **Be Prepared for Reversing**: In parking lots, the vehicle may need to reverse to exit a space. The driver should be prepared to reverse when necessary.
By following these safety guidelines, the ego vehicle can navigate the road safely and avoid potential accidents. |
| 7 | Question (behavior) | Predict the behavior of the ego vehicle. Please select the correct answer from the following options: A. The ego vehicle is going straight. The ego vehicle is driving slowly. B. The ego vehicle is slightly steering to the left. The ego vehicle is driving very fast. C. The ego vehicle is going straight. The ego vehicle is not moving. D. The ego vehicle is steering to the left. The ego vehicle is driving with normal speed. |
| GT Answer | A |
| InternVL2-2B | Based on the image, the ego vehicle is driving straight. The road markings and the presence of a pedestrian suggest that the vehicle is on a designated path, indicating that it is not turning or changing lanes. The vehicle is also not moving very fast or slowly, as there is no indication of speed control or other signs of driving at a high speed. Therefore, the correct answer is:
C. The ego vehicle is going straight. The ego vehicle is not moving. |
| Mini-InternVL2-2B-DA-DriveLM | A |
| llama_adapter_v2_multimodal7b | The correct answer is B. The ego vehicle is driving slowly. |
| InternVL2-2b-LoRA-25k-drivelm | C |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | A |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | C |
| InternVL2-2b-LoRA-300k-drivelm | C. The ego vehicle is going straight. The ego vehicle is not moving. |
| llava-v1.6-mistral-7b-hf | A. The ego vehicle is going straight. The ego vehicle is driving slowly. |
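
Several answers in the table refer to objects with tags of the form `<c1,CAM_BACK_RIGHT,1083.3,552.5>`, sometimes with spaces after the commas. The sketch below shows one way such tags could be extracted from an answer and compared against the GT tags. The names `ObjectTag`, `parse_tags`, and `count_matches`, as well as the pixel tolerance `tol`, are hypothetical helpers introduced here for illustration; this is not the benchmark's official scoring procedure.

```python
import re
from typing import List, NamedTuple


class ObjectTag(NamedTuple):
    """One object reference: id, camera name, and 2D pixel coordinates."""
    obj_id: str
    camera: str
    x: float
    y: float


# Matches tags such as <c1,CAM_BACK_RIGHT,1083.3,552.5>, allowing optional spaces.
TAG_RE = re.compile(
    r"<\s*(c\d+)\s*,\s*(CAM_[A-Z_]+)\s*,\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*>"
)


def parse_tags(text: str) -> List[ObjectTag]:
    """Return every object tag found in a free-form answer string."""
    return [
        ObjectTag(m.group(1), m.group(2), float(m.group(3)), float(m.group(4)))
        for m in TAG_RE.finditer(text)
    ]


def count_matches(pred_text: str, gt_text: str, tol: float = 1.0) -> int:
    """Count GT tags that have a predicted tag on the same camera within
    `tol` pixels on both axes (an assumed, illustrative criterion)."""
    preds = parse_tags(pred_text)
    hits = 0
    for g in parse_tags(gt_text):
        if any(
            p.camera == g.camera
            and abs(p.x - g.x) <= tol
            and abs(p.y - g.y) <= tol
            for p in preds
        ):
            hits += 1
    return hits
```

Applied to question 0, a check of this kind would separate answers that reproduce the GT coordinates from answers that repeat a generic location such as `1080.0,525.8`. Note that it compares only camera and position, so it says nothing about whether the accompanying object description is correct.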