| 0 | Question (perception) | What are the important objects in the current scene? Those objects will be considered for the future reasoning and driving decision. |
| GT Answer | There is a black SUV to the front of the ego vehicle, a white truck to the back left of the ego vehicle, a white SUV to the back of the ego vehicle, an orange truck to the back of the ego vehicle, a yield to pedestrians to the front right of the ego vehicle, and a green light to the front of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1092.5,538.3>, <c2,CAM_BACK_LEFT,822.9,391.2>, <c3,CAM_BACK,660.8,485.8>, <c4,CAM_BACK,848.3,461.7>, <c5,CAM_FRONT_RIGHT,725.7,452.2>, and <c6,CAM_FRONT,672.9,106.4>. |
| InternVL2-2B | In the current scene, the important objects include:
1. **Truck**: The white truck with the "TAYLOR LOBSTERS" logo is a critical object. It is likely a delivery truck, as indicated by the text on the side. The truck is moving in the direction of the camera, which is towards the right.
2. **Pedestrian Crossing Sign**: The yellow pedestrian crossing sign is also important. It indicates that the road is a pedestrian zone, which is crucial for safety, especially during inclement weather.
3. **Road and Vehicles**: The road itself is wet, indicating recent rain, which is important for driving safety. The presence of other vehicles, including a black car and a truck, suggests that the road is in use.
4. **Buildings and Infrastructure**: The buildings and infrastructure, such as the pedestrian crossing sign and the road markings, are important for navigation and safety. The pedestrian crossing sign is particularly crucial for ensuring that pedestrians can safely cross the road.
5. **Weather Conditions**: The overcast sky and wet road conditions are important for driving. They can affect visibility and traction, making it crucial to be cautious and adjust driving accordingly.
These objects are essential for the future reasoning and driving decisions, as they provide critical information about the road conditions, traffic, and pedestrian safety. |
| Mini-InternVL2-2B-DA-DriveLM | There is a white truck to the back of the ego vehicle, a black SUV to the front of the ego vehicle, a white truck to the back left of the ego vehicle, a black SUV to the front of the ego vehicle, and a green light to the front of the ego vehicle. The IDs of these objects are 1, 2, 3, 4, and 5. |
| llama_adapter_v2_multimodal7b | In the current scene, the important objects to consider for future reasoning and driving decisions include the truck, the car, the traffic light, the stop sign, and the building. The truck and car are both on the road, and the traffic light and stop sign are present to regulate traffic flow. The building in the background may also be relevant, as it could be a landmark or a point of interest that drivers should be aware of. Considering these objects will help drivers make informed decisions about their driving behavior and ensure safety on the road. |
| InternVL2-2b-LoRA-25k-drivelm | There is a black car to the front of the ego vehicle, a white truck to the back left of the ego vehicle, a black car to the front of the ego vehicle, and a white truck to the back of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1080.0,495.8>, <c2,CAM_BACK_LEFT,1080.0,525.0>, <c3,CAM_FRONT,1080.0,495.8>, and <c4,CAM_BACK,1080.0,495.8>. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | There is a black car to the front of the ego vehicle, a white truck to the back left of the ego vehicle, a white truck to the back of the ego vehicle, a white truck to the back of the ego vehicle, a yellow sign to the front right of the ego vehicle, and a green light to the front of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1092.5,538.3>, <c2,CAM_BACK_LEFT,822.9,391.2>, <c3,CAM_BACK,655.0,461.7>, <c4,CAM_BACK,848.3,461.7>, <c5,CAM_FRONT_RIGHT,725.7,452.2>, and <c6,CAM_FRONT,672.9,106.4>. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | There is a black car to the front of the ego vehicle, a white truck to the back of the ego vehicle, a black sedan to the front of the ego vehicle, and a white truck to the front of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1010.0,495.8>, <c2,CAM_BACK,1010.0,530.0>, <c3,CAM_FRONT,1010.0,505.8>, and <c4,CAM_FRONT,1010.0,505.8>. |
| InternVL2-2b-LoRA-300k-drivelm | There is a black car to the front of the ego vehicle, a black car to the back of the ego vehicle, a black car to the front of the ego vehicle, a white truck to the front left of the ego vehicle, and a white truck to the back left of the ego vehicle. The IDs of these objects are <c1,CAM_FRONT,1010.8,495.8>, <c2,CAM_BACK,1010.8,520.8>, <c3,CAM_FRONT,1010.8,490.0>, <c4,CAM_FRONT_LEFT,1010.8,480.0>, and <c5,CAM_BACK_LEFT,1010.8,490.0>. |
| llava-v1.6-mistral-7b-hf | In the current scene, the important objects to consider for future reasoning and driving decisions include:
1. Traffic lights: There are multiple traffic lights visible in the image, which are crucial for understanding the right of way and when to stop or proceed.
2. Pedestrian crossing signs: There are pedestrian crossing signs indicating where pedestrians might be crossing the street.
3. Vehicles: There are several vehicles, including cars and trucks, which are relevant for understanding traffic flow and potential obstacles.
4. Road markings: The road has clear markings, including lane dividers and crosswalks, which are important for maintaining proper lane positioning and ensuring safe crossing points.
5. Buildings and infrastructure: The presence of buildings and infrastructure, such as the overpass and the mall, can provide context for the location and the type of area the driver is in.
6. Weather conditions: The wet road surface suggests that the weather is rainy, which can affect visibility, road grip, and overall driving conditions.
7. Pedestrians: There are a few pedestrians visible in the image, which are important for anticipating potential pedestrian movements.
8. Traffic signals: There are traffic signals visible in the image, which are crucial for understanding the right of way and when to stop or proceed.
9. Signage: There are various signs visible in the image, including a "CAM BACK" sign, which can provide information about the traffic flow and potential hazards.
10. Road conditions: The wet road surface indicates that the road conditions might be slippery, which can affect vehicle handling and braking.
These objects are important for drivers to consider when making decisions about their driving actions, such as when to stop, when to proceed, and how to navigate the road safely. |
| 1 | Question (perception) | What is the moving status of object <c2,CAM_BACK_LEFT,822.9,391.2>? Please select the correct answer from the following options: A. Drive backward. B. Stopped. C. Going ahead. D. Turn right. |
| GT Answer | C |
| InternVL2-2B | The moving status of object <c2,CAM_BACK_LEFT,822.9,391.2> is B. Stopped. |
| Mini-InternVL2-2B-DA-DriveLM | C |
| llama_adapter_v2_multimodal7b | The moving status of object <c2,CAM_BACK_LEFT,822.9,391.2> is that it is driving backward. |
| InternVL2-2b-LoRA-25k-drivelm | C |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | C |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | C |
| InternVL2-2b-LoRA-300k-drivelm | B. Stopped. |
| llava-v1.6-mistral-7b-hf | B. Stopped |
| 2 | Question (prediction) | What object should the ego vehicle notice first when the ego vehicle is getting to the next possible location? What is the state of the object that is first noticed by the ego vehicle and what action should the ego vehicle take? What object should the ego vehicle notice second when the ego vehicle is getting to the next possible location? What is the state of the object perceived by the ego vehicle as second and what action should the ego vehicle take? What object should the ego vehicle notice third? What is the state of the object perceived by the ego vehicle as third and what action should the ego vehicle take? |
| GT Answer | Firstly, notice <c5,CAM_FRONT_RIGHT,725.7,452.2>. It is a traffic sign, so the ego vehicle should slow down and proceed. Secondly, notice <c6,CAM_FRONT,672.9,106.4>. It is also a traffic sign, so the ego vehicle should continue straight at the same speed. Thirdly, notice <c2,CAM_BACK_LEFT,822.9,391.2>. It is moving straight ahead, so the ego vehicle should continue straight at the same speed. |
| InternVL2-2B | When the ego vehicle is getting to the next possible location, it should notice the yellow pedestrian crossing sign first. The state of the object perceived by the ego vehicle as the first object to be noticed is that the pedestrian crossing sign is illuminated. The action the ego vehicle should take is to proceed with caution and ensure the safety of pedestrians. The second object to be noticed is the white truck with the "TAYLOR LOBSTER" logo. The state of the object perceived by the ego vehicle as the second object to be noticed is that the truck is stationary. The action the ego vehicle should take is to proceed with caution and ensure the safety of pedestrians and other vehicles. The third object to be noticed is the "SEA PORT WORLD CENTER" sign. The state of the object perceived by the ego vehicle as the third object to be noticed is that the sign is illuminated. The action the ego vehicle should take is to proceed with caution and ensure the safety of pedestrians and other vehicles. |
| Mini-InternVL2-2B-DA-DriveLM | Firstly, notice that the object in front of the ego vehicle is stationary. The ego vehicle should continue ahead at the same speed. Secondly, notice that the object in front of the ego vehicle is stationary. The ego vehicle should continue ahead at the same speed. Thirdly, notice that the object in front of the ego vehicle is stationary. The ego vehicle should continue ahead at the same speed. |
| llama_adapter_v2_multimodal7b | When the ego vehicle is getting to the next possible location, it should first notice the truck in front of it. The ego vehicle should then take appropriate action, such as slowing down or changing lanes if necessary, to maintain a safe distance from the truck. Next, the ego vehicle should notice the traffic light. The ego vehicle should then proceed through the intersection when the light turns green. Finally, the ego vehicle should notice the building and the sidewalk. The ego vehicle should then take appropriate action, such as checking for pedestrians or cyclists, and proceed cautiously. |
| InternVL2-2b-LoRA-25k-drivelm | Firstly, notice that <c3,CAM_FRONT,1080.8,525.0>. The object is a traffic sign, so the ego vehicle should continue ahead at the same speed. Secondly, notice that <c2,CAM_FRONT,1080.8,505.0>. The object is going ahead, so the ego vehicle should continue ahead at the same speed. Thirdly, notice that <c1,CAM_FRONT,1080.8,505.0>. The object is going ahead, so the ego vehicle should continue ahead at the same speed. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | Firstly, notice that <c6,CAM_FRONT,672.9,106.4>. The object is a traffic sign, so the ego vehicle should continue going ahead at the same speed. Secondly, notice that <c1,CAM_FRONT,1092.5,538.3>. The object is stationary, so the ego vehicle should continue going ahead at the same speed. Thirdly, notice that <c5,CAM_FRONT_RIGHT,725.7,452.2>. The object is a traffic sign, so the ego vehicle should continue going ahead at the same speed. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | Firstly, notice <c1, CAM_FRONT, 1200.0, 530.0>. The object is approaching, so the ego should slow down. Secondly, notice <c2, CAM_FRONT, 1200.0, 530.0>. The object is approaching, so the ego should slow down. Thirdly, notice <c3, CAM_FRONT, 1200.0, 530.0>. The object is approaching, so the ego should slow down. |
| InternVL2-2b-LoRA-300k-drivelm | Firstly, notice <c3,CAM_FRONT,1080.0,500.8>. It is a traffic sign, so the ego vehicle should remain stationary. Secondly, notice <c2,CAM_FRONT,1080.0,505.8>. It is stationary, so the ego vehicle should remain stationary. Thirdly, notice <c1,CAM_FRONT,1080.0,505.8>. It is stationary, so the ego vehicle should remain stationary. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to be a collage of different street scenes, each with its own set of objects and conditions. Here's a description of the objects and their states in each scene:
1. The first object the ego vehicle should notice is the traffic light. It is currently red, indicating that the ego vehicle should stop.
2. The second object is the pedestrian crossing sign. It is lit up, suggesting that pedestrians have the right of way and the ego vehicle should yield to them.
3. The third object is the traffic light. It is still red, so the ego vehicle should continue to stop.
4. The fourth object is the pedestrian crossing sign. It is still lit up, so the ego vehicle should continue to yield to pedestrians.
5. The fifth object is the traffic light. It is still red, so the ego vehicle should continue to stop.
6. The sixth object is the pedestrian crossing sign. It is still lit up, so the ego vehicle should continue to yield to pedestrians.
7. The seventh object is the traffic light. It is still red, so the ego vehicle should continue to stop.
8. The eighth object is the pedestrian crossing sign. It is still lit up, so the ego vehicle should continue to yield to pedestrians.
9. The ninth object is the traffic light. It is still red, so the ego vehicle should continue to stop.
10. The tenth object is the pedestrian crossing sign. It is still lit up, so the ego vehicle should continue to yield to pedestrians.
In each of these scenes, the ego vehicle should stop at the red traffic light and yield to pedestrians at the lit pedestrian crossing sign. It should also be aware of other vehicles and pedestrians in the vicinity and follow traffic rules accordingly. |
| 3 | Question (prediction) | Is <c1,CAM_FRONT,1092.5,538.3> a traffic sign or a road barrier? |
| GT Answer | No. |
| InternVL2-2B | To determine if the coordinates <c1,CAM_FRONT,1092.5,538.3> represent a traffic sign or a road barrier, let's analyze the image closely.
1. **Location Analysis**:
- The coordinates are given in the format of (x, y, z), where x, y, and z are the coordinates of the point in the image.
- The coordinates provided are (1092.5, 538.3).
2. **Visual Inspection**:
- **Traffic Sign**: A traffic sign typically has a distinct shape and color, which is not evident in this image.
- **Road Barrier**: A road barrier is usually rectangular or rectangular-like in shape, and it is usually placed at the edge of the road or near the road.
3. **Contextual Clues**:
- The coordinates are located in the middle of the image, not at the edge or near the road.
- The surrounding area appears to be a street with buildings and vehicles, which is typical of a road environment.
4. **Conclusion**:
- Given the coordinates are not at the edge of the image and the surrounding context, it is more likely that this is a road barrier rather than a traffic sign.
Thus, the coordinates <c1,CAM_FRONT,1092.5,538.3> represent a road barrier. |
| Mini-InternVL2-2B-DA-DriveLM | No. |
| llama_adapter_v2_multimodal7b | The image does not provide enough information to confidently determine whether the object is a traffic sign or a road barrier. However, it is clear that the object is located on the side of the road and is related to traffic or road safety. |
| InternVL2-2b-LoRA-25k-drivelm | No. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | No. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | No. |
| InternVL2-2b-LoRA-300k-drivelm | No. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to show a traffic sign rather than a road barrier. The sign is a pedestrian crossing sign, which is used to indicate the presence of a pedestrian crossing area. It is typically placed at the entrance to a pedestrian crossing to alert drivers to slow down and be prepared to stop for pedestrians. |
| 4 | Question (planning) | What actions could the ego vehicle take based on <c1,CAM_FRONT,1092.5,538.3>? Why take this action and what's the probability? |
| GT Answer | The action is to keep going at the same speed. The reason is that there is no safety issue. The probability is high. |
| InternVL2-2B | The ego vehicle, identified as the white truck, is currently positioned at the intersection of a two-way street. Based on the given sensor data, the vehicle is facing a pedestrian crossing sign, which indicates that the vehicle is approaching a pedestrian crossing. The pedestrian crossing sign is yellow with a black figure of a person, indicating that it is a warning for pedestrians to be cautious.
The ego vehicle should proceed with caution to avoid hitting a pedestrian. The probability of this action is high, as the intersection is a pedestrian crossing, and the vehicle is approaching it from the opposite direction. The pedestrian crossing sign is clearly visible, and the vehicle is in close proximity to it.
To avoid hitting a pedestrian, the ego vehicle should slow down and proceed with caution, ensuring that it does not cross the pedestrian crossing. This action is necessary to ensure the safety of pedestrians and to prevent any potential accidents. |
| Mini-InternVL2-2B-DA-DriveLM | The action is to keep going at the same speed. The reason is that there is no safety issue. The probability is high. |
| llama_adapter_v2_multimodal7b | Based on the coordinates <c1,CAM_FRONT,1092.5,538.3>, the ego vehicle is likely to take the action of turning right. The coordinates indicate that the vehicle is in the middle of a right turn, and the probability of the vehicle taking this action is high. The vehicle is likely turning to navigate through the city street, possibly to avoid an obstacle or to reach its intended destination. |
| InternVL2-2b-LoRA-25k-drivelm | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| InternVL2-2b-LoRA-300k-drivelm | The action is to keep going at the same speed. The reason is that there is no safety issue, and the probability is high. |
| llava-v1.6-mistral-7b-hf | The image you've provided appears to be a sequence of images showing a vehicle's perspective as it moves through an urban environment. The sequence includes a front-facing camera view, a left-facing camera view, and a back-facing camera view.
Based on the first image, which shows the front view of the vehicle, the ego vehicle could take several actions:
1. **Slow down or stop**: The vehicle is approaching a stop sign, which indicates that it should come to a complete stop. The probability of this action is high, as the stop sign is clearly visible and the vehicle is approaching it.
2. **Yield to pedestrians**: There are pedestrians visible in the image, and the vehicle should yield to them as they cross the street. The probability of this action is moderate, as the pedestrians are not directly in the vehicle's path but are close to the intersection.
3. **Prepare to turn left**: The vehicle is approaching a left turn, so it should prepare to turn. The probability of this action is high, as the turn is indicated by the road markings and the vehicle's position.
4. **Follow traffic rules**: The vehicle should follow all traffic rules, including obeying the stop sign, yielding to pedestrians, and making the left turn as indicated by the road markings.
In the subsequent images, the vehicle is already making the left turn, so the actions would be to continue the turn, ensuring it's safe and legal to do so. The probability of these actions is high, as the vehicle is already in the process of making the turn.
It's important to note that these actions are based on the visible elements in the image and the standard rules of the road. Actual driving decisions should be made by a human driver, taking into account the current traffic conditions, road signs, and the behavior of other road users. |
| 5 | Question (planning) | What actions taken by the ego vehicle can lead to a collision with <c1,CAM_FRONT,1092.5,538.3>? |
| GT Answer | Slight right turn. |
| InternVL2-2B | The ego vehicle, a white truck, is positioned on the right side of the road, approaching a pedestrian crossing. The pedestrian crossing is marked with a yellow sign indicating a "yield" or "slow down" zone. The truck is approaching the pedestrian crossing at a speed that could potentially exceed the speed limit, which is 25 km/h in this area.
Here are the actions that could lead to a collision with the pedestrian crossing:
1. **Speed Over 25 km/h**: The truck is currently traveling at 25 km/h, which is well below the speed limit. If the truck continues to accelerate, it could exceed the speed limit and collide with the pedestrian crossing.
2. **Lack of Pedestrian Crossing Alert**: The truck is approaching the pedestrian crossing without any visible pedestrian alert or signal. Pedestrians are expected to use the crosswalk signals to indicate when it is safe to cross. If the truck does not see the pedestrian signal, it could collide with them.
3. **Speeding Too Fast**: The truck is traveling at a speed that could be too fast for the pedestrian crossing. Pedestrians are generally cautious and slow down when approaching a crossing, especially when there is a vehicle in the way.
4. **Lack of Visibility**: The truck is not visible to the pedestrian crossing, which could lead to a collision if the pedestrian is not paying attention to the truck's presence.
5. **Driving Carelessly**: The truck is driving in a way that could be seen as careless or reckless. For example, the truck is not paying attention to the pedestrian crossing, or it is driving too fast and not stopping in time to avoid a collision.
In summary, the truck's speed, lack of pedestrian crossing alert, speeding too fast, lack of visibility, and driving carelessly are all potential actions that could lead to a collision with the pedestrian crossing. |
| Mini-InternVL2-2B-DA-DriveLM | Slight left turn. |
| llama_adapter_v2_multimodal7b | The ego vehicle is driving down a city street, and it is approaching a red traffic light. If the driver does not stop or slow down in time, the vehicle may collide with the traffic light. This can happen if the driver is not paying attention to the traffic light or if they are distracted by something else, such as using a cell phone or focusing on other vehicles on the road. It is crucial for drivers to obey traffic signals and maintain a safe distance from other vehicles and obstacles to avoid collisions and ensure road safety. |
| InternVL2-2b-LoRA-25k-drivelm | Accelerating and going straight. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | Changing to the right lane. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | Accelerating and going straight. |
| InternVL2-2b-LoRA-300k-drivelm | Accelerating and going straight. |
| llava-v1.6-mistral-7b-hf | The image you've provided shows a sequence of actions that can lead to a collision with the vehicle labeled as "CAM_FRONT" in the image. Here's a breakdown of the actions shown:
1. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
2. **CAM_FRONT_BACK**: The vehicle is backing up, which means it is moving in reverse.
3. **CAM_FRONT_RIGHT**: The vehicle is turning right at an intersection.
4. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
5. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
6. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
7. **CAM_FRONT_RIGHT**: The vehicle is turning right at an intersection.
8. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
9. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
10. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
11. **CAM_FRONT_RIGHT**: The vehicle is turning right at an intersection.
12. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
13. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
14. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
15. **CAM_FRONT_RIGHT**: The vehicle is turning right at an intersection.
16. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
17. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
18. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
19. **CAM_FRONT_RIGHT**: The vehicle is turning right at an intersection.
20. **CAM_FRONT_BACK**: The vehicle is backing up, again in reverse.
21. **CAM_FRONT_LEFT**: The vehicle is turning left at an intersection.
2 |
| 6 | Question (planning) | In this scenario, what are safe actions to take for the ego vehicle? |
| GT Answer | Keep going at the same speed, decelerate gradually without braking. |
| InternVL2-2B | In this scenario, the safe actions for the ego vehicle would include maintaining a safe following distance from the truck, ensuring that the truck is in the lane it is supposed to be in, and not overtaking the truck. The ego vehicle should also be prepared to stop if necessary, as the truck is in the opposite lane and may be changing lanes or preparing to do so. Additionally, the ego vehicle should be aware of the truck's presence and potential actions, such as braking or changing lanes, to avoid any potential collisions. |
| Mini-InternVL2-2B-DA-DriveLM | Keep going at the same speed, decelerate gradually without braking. |
| llama_adapter_v2_multimodal7b | In the image, a truck is driving down a city street with several other vehicles, including cars and a bus. To ensure safety, the ego vehicle should follow these actions:1. Maintain a safe distance from other vehicles, especially the truck, to avoid collisions.2. Be cautious when changing lanes or making turns, as the truck's size and blind spots can make it difficult to see other vehicles.3. Be prepared for sudden stops or slowdowns, as the truck may need more time to react to traffic conditions.4. Keep an eye on the truck's brake lights and turn signals, as these can provide valuable information about the truck's intentions.5. Be patient and allow the truck to merge or change lanes, as it may take more time to complete the maneuver due to its size.By following these safe actions, the ego vehicle can navigate the city street with other vehicles, including the truck, more effectively and reduce the risk of accidents. |
| InternVL2-2b-LoRA-25k-drivelm | Keep going at the same speed, accelerate, and proceed. |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | Keep going at the same speed, decelerate gradually without braking. |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | Keep going at the same speed, decelerate gradually without braking. |
| InternVL2-2b-LoRA-300k-drivelm | Keep going at the same speed, decelerate gradually without braking. |
| llava-v1.6-mistral-7b-hf | In the image you provided, it appears to be a series of photos showing a vehicle's perspective while driving through an urban area with various traffic conditions. Here are some safe actions that the ego vehicle should take in each scenario:
1. **CAM_BACK_LEFT**: The vehicle should maintain a safe following distance from the vehicle in front of it. This is important to avoid collisions and to have enough time to react to any sudden changes in traffic.
2. **CAM_BACK_RIGHT**: The vehicle should be aware of the traffic conditions behind it, including the presence of pedestrians and cyclists. It should also be prepared to change lanes if necessary, but only when it is safe to do so.
3. **CAM_FRONT_LEFT**: The vehicle should be cautious of the pedestrians and cyclists in the area, especially if they are crossing the street or riding on the sidewalk. It should yield to them when required and be prepared to stop if necessary.
4. **CAM_FRONT_RIGHT**: The vehicle should be aware of the traffic lights and signs, obeying them to ensure safe navigation through the area. It should also be prepared for the possibility of pedestrians crossing the street.
5. **CAM_BACK_LEFT**: The vehicle should be prepared to stop if the traffic light turns red or if there is an obstruction ahead. It should also be aware of the pedestrians and cyclists in the area.
6. **CAM_BACK_RIGHT**: The vehicle should be cautious of the pedestrians and cyclists in the area, especially if they are crossing the street or riding on the sidewalk. It should yield to them when required and be prepared to stop if necessary.
7. **CAM_FRONT_LEFT**: The vehicle should be aware of the traffic lights and signs, obeying them to ensure safe navigation through the area. It should also be prepared for the possibility of pedestrians crossing the street.
8. **CAM_FRONT_RIGHT**: The vehicle should be aware of the traffic lights and signs, obeying them to ensure safe navigation through the area. It should also be prepared for the possibility of pedestrians crossing the street.
In all scenarios, the vehicle should be attentive to its surroundings, maintain a safe speed, and be prepared to react to any unexpected changes in traffic or pedestrian activity. |
| 7 | Question (behavior) | Predict the behavior of the ego vehicle. Please select the correct answer from the following options: A. The ego vehicle is going straight. The ego vehicle is driving with normal speed. B. The ego vehicle is steering to the right. The ego vehicle is driving very fast. C. The ego vehicle is going straight. The ego vehicle is driving slowly. D. The ego vehicle is steering to the left. The ego vehicle is driving fast. |
| GT Answer | A |
| InternVL2-2B | Based on the image, the ego vehicle is driving on a wet road, which suggests that the road conditions are slippery. Given this, the most appropriate behavior for the ego vehicle would be to slow down to ensure safe driving on the wet road.
Therefore, the correct answer is:
C. The ego vehicle is going straight. The ego vehicle is driving slowly. |
| Mini-InternVL2-2B-DA-DriveLM | A |
| llama_adapter_v2_multimodal7b | The ego vehicle is going straight, driving with normal speed. |
| InternVL2-2b-LoRA-25k-drivelm | A |
| InternVL2-2b-LoRA-25k-drivelm-offline-redcircle-ctag-bkgd | A |
| InternVL2-2b-LoRA-25k_plus_DL-PL-10pct | C |
| InternVL2-2b-LoRA-300k-drivelm | C. The ego vehicle is going straight. The ego vehicle is driving slowly. |
| llava-v1.6-mistral-7b-hf | A. The ego vehicle is going straight. The ego vehicle is driving with normal speed. |
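
Several answers in the table identify objects with tags of the form `<c1,CAM_FRONT,1092.5,538.3>` (an object ID, a camera name, and two image coordinates). As a rough illustration of how such tags could be compared against the GT answer, the sketch below parses them with a regular expression and counts exact matches on camera and coordinates. The names `ObjectTag`, `parse_tags`, and `count_exact_matches` are hypothetical helpers introduced here, and exact matching is an assumption made for simplicity; it is not the scoring method used to produce the table.

```python
import re
from typing import List, NamedTuple

# Matches tags such as <c1,CAM_FRONT,1092.5,538.3>; optional spaces after
# commas are tolerated because some answers write <c1, CAM_FRONT, 1200.0, 530.0>.
TAG_RE = re.compile(
    r"<\s*(c\d+)\s*,\s*(CAM_[A-Z_]+)\s*,\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*>"
)


class ObjectTag(NamedTuple):
    """Hypothetical container for one parsed object tag."""
    obj_id: str
    camera: str
    x: float
    y: float


def parse_tags(text: str) -> List[ObjectTag]:
    """Return every object tag found in an answer string, in order."""
    return [
        ObjectTag(m.group(1), m.group(2), float(m.group(3)), float(m.group(4)))
        for m in TAG_RE.finditer(text)
    ]


def count_exact_matches(gt: List[ObjectTag], pred: List[ObjectTag]) -> int:
    """Count distinct (camera, x, y) triples shared by GT and prediction.

    Assumption: a prediction only counts if camera and both coordinates are
    identical to a GT tag; the object ID itself is ignored.
    """
    gt_keys = {(t.camera, t.x, t.y) for t in gt}
    pred_keys = {(t.camera, t.x, t.y) for t in pred}
    return len(gt_keys & pred_keys)


# Tags copied from the GT answer and two model answers to question 0.
gt_tags = parse_tags(
    "<c1,CAM_FRONT,1092.5,538.3>, <c2,CAM_BACK_LEFT,822.9,391.2>, "
    "<c3,CAM_BACK,660.8,485.8>, <c4,CAM_BACK,848.3,461.7>, "
    "<c5,CAM_FRONT_RIGHT,725.7,452.2>, and <c6,CAM_FRONT,672.9,106.4>."
)
lora_25k_tags = parse_tags(
    "<c1,CAM_FRONT,1080.0,495.8>, <c2,CAM_BACK_LEFT,1080.0,525.0>, "
    "<c3,CAM_FRONT,1080.0,495.8>, and <c4,CAM_BACK,1080.0,495.8>."
)
redcircle_tags = parse_tags(
    "<c1,CAM_FRONT,1092.5,538.3>, <c2,CAM_BACK_LEFT,822.9,391.2>, "
    "<c3,CAM_BACK,655.0,461.7>, <c4,CAM_BACK,848.3,461.7>, "
    "<c5,CAM_FRONT_RIGHT,725.7,452.2>, and <c6,CAM_FRONT,672.9,106.4>."
)

print(count_exact_matches(gt_tags, lora_25k_tags))
print(count_exact_matches(gt_tags, redcircle_tags))
```

Under this strict criterion, answers that give only bare indices (such as "1, 2, 3, 4, and 5") or free-form descriptions yield no parsed tags at all. A tolerance on pixel distance would be a natural relaxation, but the threshold would be a further assumption.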