04-11-2022 01:04 PM
NI,
We made it through two weekends of competitions this year with no issues and then struggled at the district champs. Before champs, we added a USB camera. We are using Java with VS Code and are not doing anything fancy with the software: we make subsystems with Falcons or NEOs and then create commands that use those subsystems, which should all fall under the 'standard practice' category.
All season, we have been using a CANivore plugged into one of the Rio's USB ports. Before district champs, we plugged a camera into the other port. Before arriving at the event, we noticed that deploying code would often result in CAN errors on either the Rio's network or the CANivore's network. On the CANivore, the symptom was a red STAT light, which has something to do with the USB connection. Restarting the Rio or power cycling the robot usually made the problem go away.
At the event, we had CSAs looking at our robot for many hours as we often were not driving on the field. We finally swapped to our spare rio and the FTAs reported that its ethernet port stopped working when the 15-second auto period ended. We then swapped to a loaner rio from spare parts in the pit and removed the USB camera at the same time. We did not see those CAN errors for the rest of the weekend and were able to drive in the last few matches.
I'm suspecting the first rio may have a bad USB port and the other one has a bad ethernet port. What can we do to figure out the problems and make sure the team has a good rio or two to use in future competitions?
Thank you!
04-11-2022 01:38 PM
More information:
The day after the initial assessment of our problems, the CSA brought a scope and verified that both CAN networks were clean: the signals were true square waves with the high voltage levels expected for a good CAN network.
04-12-2022 10:07 AM
I'm seeing a couple of issues here and I'll do my best to work through each of them individually. I think the easiest way is to ask all of the questions I have for one and then all I have for the other.
Let's start with the Ethernet port:
There are two things the FTA would have available to them in this situation. They can physically look at a disabled robot on the field or they can look at what FMS is telling them is happening. I don't see any way to look at the disabled robot and make this determination. At best, they'd be trying to look at the LEDs to see what's occurring and it sounds like there wasn't communication taking place. FMS would tell them the Driver Station is plugged in and working correctly, the Radio is responding to them, but the roboRIO couldn't be reached. It wouldn't diagnose why. The common answers here are:
From what I understand, you used this roboRIO in several matches prior. You immediately swapped to a spare parts roboRIO after this problem occurred in one match. You didn't see the problem re-appear. Is that correct? (I'm asking a bunch of questions here to get a feel for what we're working with, as I want to make sure the path we move forward on actually solves the issue you saw, rather than feeling like it resolved it and later finding out it didn't.) It doesn't sound like we've done anything with the original roboRIO here for testing/debugging other than the swap. With the swap, options 2 and 3 become less likely (though gremlins in code have happened). Is there anything from the console in that match that would have suggested an error and crash? (Or did we look? If not, it's ok. It just gives us more information if we had seen the console.) I can't really offer feedback on option 1 as I don't know if any LEDs were visible. Did the Ethernet port appear to work after the match?
It's possible the physical port is the problem. Though, that means the physical port had to work prior to the match, during pre-start to connect to the field, during auton, stop working physically during teleop, and then regain functionality after the match (assuming it appeared to work afterwards). That explanation is less likely than the three above so I'd want to explore what troubleshooting we've done (including a potential reimage of that roboRIO) before declaring the Ethernet port the problem and moving forward with confidence I couldn't be secure in.
_________________________________
For the scoped CAN network, what was sending the CAN messages across the bus? It sounds like you suspect the USB port was the issue. Did we try anything else in that port? Did we try swapping ports? I imagine so but the information about those wasn't here so it's hard for me to understand what we did before getting into the more unusual troubleshooting steps. Similar to the last issue, have we re-imaged this roboRIO and seen the behavior persist?
04-12-2022 12:08 PM
Thank you for the thorough and thoughtful reply. You are right that there are two issues. Hopefully the comments below add some clarity and answer your questions.
----------
1. Ethernet issue on Rio #2:
This Rio was on the robot for one match. Comms worked in the pit to transfer code by ethernet. The Rio was recently re-imaged by USB to be up-to-date for the season and so it would be more readily available as a backup. (Rio #2 was not the one we had been using all season; we pulled it out in the last week as a just-in-case spare. Now that I think about it, this was the same Rio that had lots of ethernet problems at our 2020 event before covid hit, and we just never followed up on it. At that event, we had comm problems and lots of FTAs and CSAs looking at it. They tried loaner VRM and radio modules, but our comms were still intermittently problematic.)
I do not know what the FTAs saw in the status lights on the field.
The same wires were used for Rio #2 in this one match that were used on Rio #1 in all the matches leading up to this match as well as were used on Rio #spare-parts-loaner after this match where comms worked.
So, no we did not use this Rio in several matches prior and this was the only time we had this problem at the transition to teleop all season (three events, with the first two events completed 100% with Rio #1).
I do not know if the port worked after the match as we removed Rio #2 and haven't powered it on since. Looking back at this one match and our 2020 experience, I'd say Rio #2 had comm problems in one of every 2 or 3 matches and it happened to be the first match where it was used in 2022.
I do not know what the console was saying but I did save log files for this entire past weekend as well as from Sunday of the previous competition before we had either of the problems.
----------
2. CAN issues on Rio #1:
We have the first CAN network Rio1-SparkMaxes-CanCoders-PDH which uses the CAN terminals on the corner of the Rio. We scoped this network and the CSA said it looked good. This network has CAN ID 8 which was one of the red messages we saw when things stopped working. This network was fine for our first two competitions and we started noticing the errors between those two comps and states last weekend.
We have the second CAN network Rio1(USB port to CANivore)-Falcons-120ohmR. We also scoped this network and the CSA said it looked good. This network also started having errors in the week leading up to last weekend and during last weekend at states. Lots of red messages and red blinking STAT light on the CANivore, often after deploying code but also at seemingly random times. Power cycling (or restarting the Rio from the driver station software) cleared it up most of the time.
I do not know if we tried swapping ports. The problems started when we added a USB camera to the other USB port on Rio #1 in the week leading up to states.
We did re-image Rio #1 on Friday night at states in an effort to overcome the CAN problems. The behavior persisted.
Thank you again!
04-12-2022 12:31 PM
For Ethernet Rio#2, let's use the struggles from 2020 and this season and not focus TOO much on that effort. The issue appears to be intermittent so I'd want to throw out a disclaimer that intermittent means something that seems like a solution can just as easily be one of the times things didn't go wrong. Keep an eye out for that. I'll send an email to the address associated with your forum account to talk about how to get a replacement.
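To put a rough number on the intermittency caveat above, here is a minimal Java sketch. It assumes each match is independent and that the fault shows up at a fixed rate of "one of every 2 or 3 matches", as estimated earlier in the thread. The class and method names are hypothetical and are not part of any FRC library.

```java
public class IntermittentCheck {
    /** Chance that a fault with the given per-match failure rate stays hidden for n matches in a row. */
    public static double chanceOfCleanStreak(double failureRatePerMatch, int matches) {
        return Math.pow(1.0 - failureRatePerMatch, matches);
    }

    /**
     * Smallest number of consecutive clean matches after which a fault that is
     * still present would have shown itself with at least the given confidence.
     */
    public static int cleanMatchesNeeded(double failureRatePerMatch, double confidence) {
        if (failureRatePerMatch <= 0.0 || failureRatePerMatch >= 1.0
                || confidence <= 0.0 || confidence >= 1.0) {
            throw new IllegalArgumentException("rate and confidence must be strictly between 0 and 1");
        }
        int matches = 0;
        double stillHidden = 1.0; // probability the fault has not appeared yet
        while (stillHidden > 1.0 - confidence) {
            stillHidden *= 1.0 - failureRatePerMatch;
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed rates taken from the "one of every 2 or 3 matches" estimate.
        double[] rates = {1.0 / 2.0, 1.0 / 3.0};
        for (double rate : rates) {
            System.out.printf(
                "failure rate %.2f: chance of 3 clean matches by luck = %.0f%%, clean matches needed for 95%% = %d%n",
                rate, 100.0 * chanceOfCleanStreak(rate, 3), cleanMatchesNeeded(rate, 0.95));
        }
    }
}
```

Under those assumptions, a fault that appears in one of every three matches needs 8 clean matches in a row before a "fix" can be trusted at the 95% level (5 clean matches if it appears in one of every two). A single clean match proves very little.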
That should make the thread easier as we can just discuss the CAN device. That one also seems more... interesting.
Can you talk me through your theory for the USB port? From my understanding, you have two separate CAN networks, each on their own bus. One goes through the standard CAN connections on the upper left side of the roboRIO. One goes through the CANivore adapter in the USB port. I'm going to list out some things that I'm taking from the posts and I want you to correct me anywhere I'm misunderstanding or add pieces I may be missing. Then we can figure out a plan for that troubleshooting/device. 😃
Does that all sound right?
We haven't swapped USB ports to see if the behavior follows. Though, I'm not sure we'll get a LOT of value there. Was the camera working when plugged in? Am I correct in saying the camera was connected to the second roboRIO? If not, does plugging it into that one break the behavior (or are we avoiding this because we have a working roboRIO and fear we'll have none after a camera connection)? If we unplug the CANivore and comment out that code, do we still see errors? Ideally we'd narrow down the two CAN networks and troubleshoot one at a time to find what we're running into. It sounds like both are struggling which makes me question the USB theory. With both failing simultaneously, it also sounds less like a hardware issue so I'm trying to poke holes and let you tell me where I'm wrong so we can find the best path forward 😃
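One way to do the "unplug it and comment out that code" step without editing code back and forth is a pair of switches that decide what gets constructed. The sketch below is plain Java with hypothetical names. In a real robot project each string would stand for constructing the matching subsystem, and the device grouping is only what this thread describes (Falcons for the drivetrain and shooter on the CANivore, everything else on the roboRIO's own bus).

```java
import java.util.ArrayList;
import java.util.List;

public class HardwareToggles {
    /**
     * Returns the pieces of the robot that would be started for a test run.
     * Hypothetical stand-in: each entry represents constructing one subsystem.
     */
    public static List<String> subsystemsToStart(boolean useCanivoreBus, boolean useUsbCamera) {
        List<String> started = new ArrayList<>();
        // Devices on the roboRIO's built-in CAN bus are always started.
        started.add("rio CAN devices");
        if (useCanivoreBus) {
            // Per the thread, the CANivore bus carries the Falcons.
            started.add("drivetrain");
            started.add("shooter");
        }
        if (useUsbCamera) {
            started.add("usb camera");
        }
        return started;
    }

    public static void main(String[] args) {
        // Example run: CANivore unplugged and camera unplugged.
        System.out.println(subsystemsToStart(false, false));
    }
}
```

Flipping one switch per test run keeps the code change matched to the hardware change, so an error that remains with the CANivore disabled can be attributed to the other bus.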
04-12-2022 01:21 PM
1. yes
2. yes
3. yes
4. yes
5. don't know - need to test - the last step at states was to use the spare parts Rio and at the same time we removed the USB camera and commented out all related code
6. yes, when the USB camera is present
7. yes
8. don't know - we did not try the spare parts Rio with the camera - all other cables did remain the same - at this point the team just wanted to be able to drive for an entire match and they willingly sacrificed the camera which wasn't giving a feed with the pre-spare-parts Rio anyway
I'm not sure whether the team swapped the camera to the other USB port.
The camera worked in the pit and in the shop but did not ever show a live feed during a match (per the drive team).
For the other questions, we'd have to do some testing in the shop: trying our 3rd and final Rio with the camera (that is the Rio still on our rookie robot from 2019 where we did have a USB camera working all season), unplugging the CANivore (which is running our drivetrain and shooter), etc. The Rio's CAN network only had one device reporting red messages and we haven't had the time to swap the SparkMax out. It has always worked even with the occasional red message for that ID (ID 8). The much bigger failure was the CANivore's STAT light showing red, which meant the robot could neither move nor shoot. Interestingly, the CANivore STAT light blinking red is related to USB problems, according to CANivore documentation.
If there is something we can do in the shop to help diagnose, please let us know. It may be the weekend before we are able to do so. (Such as unplug the CANivore and comment out the code or try our last Rio with the USB camera; we will see what we can do about answering some of your standing questions.)
Thank you!
04-12-2022 03:20 PM
Let's try, at a minimum, the other USB port or something to understand what we're seeing there.
It's still strange to me that both CAN buses would go down at once if it's the USB. They sound like they should be independent. It's also worth understanding whether the problem goes away if we remove the camera from the non-working setup. "If the camera is plugged in" makes it sound like the problem goes away with the camera unplugged. Or, it won't exist until the camera gets plugged in. Those are both strange but getting that characterized versus it being a red herring would be incredibly useful for both of us 😃
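The two checks discussed here (camera plugged in or not, CANivore in its original USB port or the other one) give four bench configurations. A minimal Java sketch of that test matrix follows, with a simple rule for reading the results. All names are hypothetical, and the interpretation assumes each configuration is run enough times for an intermittent fault to show up.

```java
import java.util.ArrayList;
import java.util.List;

public class UsbIsolationPlan {
    /** One bench configuration to try. */
    public static final class Trial {
        public final boolean cameraPlugged;
        public final boolean canivoreInOriginalPort;

        Trial(boolean cameraPlugged, boolean canivoreInOriginalPort) {
            this.cameraPlugged = cameraPlugged;
            this.canivoreInOriginalPort = canivoreInOriginalPort;
        }
    }

    /** All four combinations, camera-unplugged cases first. */
    public static List<Trial> allTrials() {
        List<Trial> trials = new ArrayList<>();
        for (boolean camera : new boolean[] {false, true}) {
            for (boolean original : new boolean[] {true, false}) {
                trials.add(new Trial(camera, original));
            }
        }
        return trials;
    }

    /** failed[i] is true if CAN errors were seen in trials.get(i). */
    public static String interpret(List<Trial> trials, boolean[] failed) {
        boolean anyFailure = false;
        boolean allFailuresHaveCamera = true;
        boolean allFailuresInOriginalPort = true;
        for (int i = 0; i < trials.size(); i++) {
            if (!failed[i]) {
                continue;
            }
            anyFailure = true;
            allFailuresHaveCamera &= trials.get(i).cameraPlugged;
            allFailuresInOriginalPort &= trials.get(i).canivoreInOriginalPort;
        }
        if (!anyFailure) return "not reproduced";
        if (allFailuresHaveCamera && allFailuresInOriginalPort) return "needs camera and original port";
        if (allFailuresHaveCamera) return "follows camera";
        if (allFailuresInOriginalPort) return "follows port";
        return "neither camera nor port alone";
    }

    public static void main(String[] args) {
        List<Trial> trials = allTrials();
        // Example results only, not measured data: errors seen whenever the camera was plugged in.
        boolean[] failed = {false, false, true, true};
        System.out.println(interpret(trials, failed));
    }
}
```

A "neither camera nor port alone" result would point away from the USB theory and back toward something the two CAN buses share.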